[[METHOD AND APPARATUS FOR TRACING PACKETS]]HASH-BASED SYSTEMS
AND METHODS FOR DETECTING, PREVENTING, AND TRACING NETWORK
WORMS AND VIRUSES



Field of the Invention 



worms and viruses, and tracing their paths through a network.
Description of Related Art

Availability of low cost computers, high speed networking products, and readily available
network connections has helped fuel the proliferation of the Internet. This proliferation has
caused the Internet to become an essential tool for both the business community and private
individuals. Dependence on the Internet arises, in part, because the Internet makes it possible for
multitudes of users to access vast amounts of information and perform remote transactions
expeditiously and efficiently. Along with the rapid growth of the Internet have come problems
caused by malicious individuals or pranksters launching attacks from within the network. As the
size of the Internet continues to grow, so does the threat posed by these individuals.

The ever-increasing number of computers, routers, and connections making up the Internet
increases the number of vulnerability points from which these malicious individuals can launch 
attacks. These attacks can be focused on the Internet as a whole or on specific devices, such as 
hosts or computers, connected to the network. In fact, each router, switch, or computer 




the field of network security and, more
particularly, to systems and methods for detecting and/or preventing the
transmission of malicious packets, such as
connected to the Internet may be a potential entry point from which a malicious individual can
launch an attack while remaining largely undetected. Attacks carried out on the Internet often
consist of malicious packets being injected into the network. Malicious packets can be injected
directly into the network by a computer, or a device attached to the network, such as a router or
switch, can be compromised and configured to place malicious
packets onto the network.

The most publicized forms of network attacks often involve placing thousands or
millions of packets onto the network using a practice known as flooding. The flood of packets
can be targeted to a specific device on the network, for example a corporate web site, thus

designed to clog the links, or connection points, between network components. Network attacks

bogMS-fete r ne t Pro tocol ffi*)-«d& ! &ssefr ; w^ t he pa ck«ts----oi-ig-ins 

imposs&le4o-tfeterm^ 

enflaut ed a tet4Hmpie iel oned t»-6&-tF£HWfamation*-"WMm a pack e t is transfer t n e dr i t 
undergoes a process that changes the original packet into a new packet; as, for example, would
happen during tunneling or network address translation (NAT). Locating the origin of a network
attockis-f ur ti i e r eorof&egted-feeeagse-^a*^^ 
a t taeky - fnid t ipfe-n e twef^ ^ 

A distributed attack is one that is launched essentially simultaneously from several locations 

f0#lj Ne t w^t#a t ^ 

d ^ yH » f e- € ^ j^ ^ 

extremely difficu l t te d e t e ct- s ingle pack e t at^ 

data, currently, must be analyzed after the fact to determine if a single packet attack was the
|003j Muc l u'f the difficultx in ul e nni % - ^e-> becaus e the

ft# e fft e fr«ffift^ft-a-stflte1es3 roHtit%-«rfr-ft5toe^ 

solely on destination addresses. Although source IP addresses may be transmitted with data,

«o- w mtlarity4o4he- « e^^ 
t e e 4 i« r ky« ^-a .^^ 

packets at the ultimate destination device rather than attempting to locate their origin. Such
tM-ig-ie"i - B"i - &fe«- e d"to - aS ' an entry point, also referred to as an ingress point or imniskm4<mtiwfk 
onto the network. Failing to identify the source address of malicious packets inhibits preventing
fn^hef-a$t£rek%and^tt^ 

Hial ' teiouS ' pa.ek e tS; Tvt'0 prior art. autonomous syst e ms ar e s hown, PASS and PASSy ' r e spectiv e ly ; ; 

et>a tte e - t e 4 -t ^ ^ ^ ^ 

a^rK>nmts-^y^tea^^ 

B4 - H5 for PAS2, resp e ctive l y. An AS is normally connect e d to th e public n e twork by one or 
■ fer t et i ena l ity - . - 

f0#£j Border routers contain routing tables for other"fetrtef$--within4he-A-S - -and4ef 

routers within the public network that are connected to the AS by a link, i.e. a communicative

ee« B » e efe t ear4^HR^r4-rR4 i»al^^ 

fepfe s eM a tive^iA 

« y e- iaed - *»«^ 

d^ir^d-4e s faafkM-ad4r- e ssr 

Firewalls are typically installed between a local area network (LAN), or intranet, 

and-difr-kiter-netr-or-pubye-network' Pkgwal-ls-aet-as-gatelt^ 
certain packets in while excluding other packets. Firewalls may be implemented in routers or

R-Bb-^»-ftf e -ti6 e d-by-l t f e waU5"to4 e tef»^« e- wh - ie - h - packets will be aiiow^d-tnte-thett-f- ee p e c^v e 
AS-and^ S i ne e -Hifcs-d e i^^^ 

updat e 4-0« HV f e gnIar4K ^ 

$&f\ Additional -prot e ction for an AS may b e obtained by supplem e nting herd e r meters 

and-fe&ways wkfe i etru - vion det e ction systems (IPSs). IDSs also use rule - based algeethnB-to 
determine if a given pattern of network traffic is abnormal. The general premise used by an IDS
is4lMt-^4kim^ 

traffic. Using a rule set, an IDS monitors inbound traffic to an AS. When a suspicious pattern

fkewali-to-modif^ 
a€4kms~nmy4nehrt^^ 

a particular s ourc e ■■■ addr ess- ; - or di s card i ng pack e t s addr esse d te a part i enlar d es t i nation- fu f ig ) . 
IPS! is used to protect PAS I and JDS2 is used in conjunction with F \ to protect .PAS2. 
AifflOHgh-befdef--.^ 

they rely on rule-based look up tables containing signatures of known threats. In addition,

b i -mie 3 H ; onters r fe 

i ngfess4e^a $ i^»refn^^ 

pae k ets- be^ a i ^ p ^ ^^^ 
tm 4-$ w i fc& es r b e4 e^ ^ 

inferrnaiien about e ach link trayers e d by a pack e t; T-e obta i n t l *i» information^^ 
ret*M i ft -- w^ 

information about, or a copy of, each packet traversing a network. With high-speed routers
i^wnggtgahn^t>fdatape i- s e i: > : > nd --T ; ^ - i fe-ttn -- e^o4 - paeket s -4s--not-pt : aetkal-. - 






In re, U.S. 10/654,7?1 Changes made to 09/881,074 to create CIP app 10/251,403

[009] What has been needed and what has not been available is a method for identifying

fe«%ift-e&ft^^ 

addresses all shortcomings of prior art protection techniques. Embodiments of the present

SUMMARY OF THE INVENTION
pM-0f Embedment s - o-f^^^^ 

into a network. More specifically, in a network including multiple hosts and multiple routers for
facilitating transmission of packets on a network, a system, for example, is employed for
dete?mir4t^ 4fte - fK^ ^ 

intrusion detection system isolates the malicious packet and thereby determines the point of

ea&y-e &fa e maliciott»iHK?ke ^ 

s«fve«Betedes-a-r«ea«s4^ 

on fr tep-a -way. in stilt a4«dn*rHiHBbe<fe^ 

for establishing a bit map of hash values representative of packets having passed through the

fesp e e#ve-fenier- r ary--^^^ 

t h e 4Mish - ¥aia e s - ef^ 

fO044J to - a - ft i r t h e r - asp e et - ^ 

where- at-least one of the packets is a target packet, the network includes at least one network 
eenipon e ntra-d e t 

server; A tperyines 
messag&4deftti.fes4^^^ 
feem4fee-4ffsH>etw^^ 
contained therein. And, the information is used in a manner that allows the entry point of the
1 001 2 1 fa- yet -a- further aspect of th e invention, in a n e twork carrying a pfarality of 

header p&vtkm 4m\\dm m\ addr e s s of th e network component And, a body perti on 4 mkalm -at 
kaM-afertioft ef-the target packet, th e body portion b e ing compared to corresponding 
representations wher e a match betwe e n a portion of th e target pack e t and on e of th e 

|W43| fes* 4 ti - a - fet^ 

about a subset of the plurality of packets having passed through the network component. The
tt e t - wofk"€< » H^ 

n et we - rfe- A- data strnc4 : a*eH&efed4r^^ 

A network component identification attribute corresponds to a location of the network

a^mj»na&:"-A4afg ^ 

a t tH * H i ( t > - ass < x*a^ 

lBdkateg--thaH^ 

pM4| fe4»a#w^ag e ^ ^^ 

networks; A feri^ 

^m^eteeted-m^kkHts-pack-ete-ffl a network. A stil l , further advan t age of tfee-iaveafaeft-ts-fetf-k 
d^teet s - it» 4 ieim * s - ff^^ 




devices thus enhancing network security. Another advantage of the invention is that it 
effi^ien^-uses-stef^^*^ 

[0015] It is thus a general object of the present invention to provide improved packet

[0016] It is another object of the present invention to eliminate problems caused by

m«ik%^i»^€ : fe ete 4« -- a - ft et ¥^Fk~- 

f0017| l-t-is-a-feth e r - ehj e et - ef the pr e s e nt invention to identify malicious packets to 

laeilliate-id e fttil^ 

f001-8j &4»»foi#Ha i -e%f^^ 

malicious packets when distributed attacks are launched against a network.

jO043| fets-yeta-ft^^ 

krfemat 'i e r v ft he i if^ 

pft2Sj Farther object^ad--aet¥^tage&-e#^ffeg^Hm^^-witl--^0t»e-ia^ 

eenjuneiien ■ with the a ceentpany i ng ■ drawings ' in which: 

BRIEF DESCRIPTION OF THE DRAWINGS 
( 0 0 24 j F - i - g:'44» - a- h^ 

P»g~ - 24» a 44t^k^ ^ 

operating in conjunction with an Internet network;

e x - t e rna l -r te twoFk -sf 

[0024] Fig. 4 is a flowchart illustrating an exemplary method for use with a source path

i s fc4at - ien - -s e F¥ e r-; 
[0025] Fig. 5 is a schematic diagram of an exemplary data structure for storing

techniques; and 

[0026] Fig. 6 is a block diagram of a general-purpose computer configurable for

DETAILED DESCRIPTION OF PREFERRED EMBODIMENT 

|0937j A - pr e ferred -e mi^ ^ ^ ^ ^ 

neiworkeotnpon e nts. or d e vic e s^ mch m t\ muter within an a utonomeus s ystetn f AS- ) ■ ■ to 
determine the ingress point, or location, for a malicious packet (MP1). Fig. 2 illustrates an
er*&odir mm4 * ia t- r«a^ 

bfok e a-HHo -t hf e e - g e ncral areas enclosed within borders with communication media., such as 
IMcs^earrying data traffic across th e n et work, conn e cting th e general are as: LiEks-s e rv e as a 
^aasfi»s&«^ 4B e 4kt4 - et^ oomprised-eSwkereptieftl 
fibefy-radk) fr e quency (RF) transpond e rs; or th e - like - . 

[0028] The rightmost portion of Fig. 2 denotes an AS, shown as AS1, enhanced by the

addit i on o f a 

t» . W erk"^ A l se-metad e d 

■widyrHA S l---^ 

host computers H1-H3. I PS ! may take the form of a commercia ll y a va i lal>i e - i;9S-; - er 
a&emat4¥e ly4t-n^y4>e -d^^ 

aftd-« ; te d»dS"-4&S»-a - nd firewalls ate well known in the-^-and-wiU not be desefibed-ift-detail 
herein. An informative source of information on IDS and firewall functionality that may be used
with the disclosed embodiments can be found in Firewalls and Internet Security: Repelling the



(0029| 


kcr t by William R. Cheswick and Su-wr, M. Bellowin, Addioon We-sle 
SSf4B«y43eeefflpt4sed-efa-^ 


y{1991), 
opomttvoly 


eoapled-4 




^e--perfem* 
source path isolation, in conjunction with SR14-17 and IDS1. While SS1 and IDS1 are shown as

bot^inft-tisien-de^^^ 4- 1 7 may b e compf- i s e d-e-f -eom t iaefgta - Ijy 

avaifehle-rotiter s re^ 

ear Tying -traffic b e tw ee n the an tonotnoo s s yst e ms r narn e i y I AS I . and AS S v and AS 3 : PN 4 
eetHpfjse&-f««t«y&-R-2 • R.6> Links operatively coupling the routers making up PN% andlinks 
attaching to ASs coupl e d to PN 1 - PN1 may aloo comprise computers ext e rnal to an AS (not 

isolation routers (SRs) are denoted as Rx, such as those located in PN1, where x is a number

[0031] The lower portion of Fig. 2 includes other autonomous systems, AS2 and

$ka*H a aay4 3e-ep e *a** ¥el^^ 

( 0032) Th e ie fenost port i on-ef-Figr2-showS ' ■■ aR-a^tOBe ' mous- syste m-ftA.SO us e d4>y-aa 

■ m^adef"tt» - kutnch ajHmae&eEH^Hr-4A^^ 

t€Hfeee4^est-«e«a^^^ a s ing links : In Fig 2 ; Li ha»^ e tv€erfigHF ^ d- ' Stie^#>at4t 

places a malicious packet. (MP1) onto IAS1 for trQtHftw^t&«4e--A-S-l"-via"PNj".""\^Hte-Ptg:"2 

othet-hardware ca p able of placing machine-readable da^aH3«je-frft e *wef i e-*aa^^ 
ef"ef4ft - € - onp«6 t k>ivwkh -- ^^ 

onto a network, it is referred to as an intruder or intruding device.
f#0££| TtvteiiHe&an-attackT^ ^ 

■ a4tak-fe>r-tfaBS«Hs e k>n to one or more destination devices having respect- i ve-il es t-ka^oE 
addresses. In Fig. 2, the heavy lines are used to indicate the path taken by MP1, namely I1 to
IDS2 ; IDS2 R6, R 6 R3, R3 R2 , R2 SRI 5, SRI 5 SRI 6, a nd SR16 IDS! (wh e r e hyph e n atio n 

implies operative eonphng betw ee n- n e tworfe-eonipone - nt s ); ^fee thiek dashed link #om IQS i IM 

d e nete s - the intatded iHith-te the targeted device IB. 
[0034] Detection and source path isolation of MP1 may be accomplished as follows.
Detection device, here IDS1, identifies MP1 using known methods. After detecting MP1, IDS1
generates a notification packet, or triggering event, and sends it to SS1, thus notifying SS1 that a
portk>« s 4hereef#leng wi-^ 

encapsnlation info^ 

been identified and forwarded to SS1 it is referred to as a target packet (TP1) because it becomes
the target of the source path isolation method further described herein.

SS1 may then generate a query message (QM1) containing TP1, a portion thereof,

er-^eprese n t ati^ 
bfeffltatioft-ak)^ 

send QM1 to participating routers located one hop away; however, the disclosed invention is not

Mmke4404H- ag le --hep S : 1 F - of-« - >HMft f >^ ^ ^ 

SR47" a fe4wo - hop$ - aw^ 

receives QM1 from SS1, SR16 determines if TP1 has been seen. This determination is made by
e - empawng TP l w4&-a4afebase-eefl^^ 

encountered, a packet when the packet is passed from one of its input ports to one of its output
[0036] To determine if a packet has been observed, SR16 first stores a representation of



TP1 contained in QM1. Typically, a representation of a packet passed through SR16 will not be
a copy- of -the ew^^^ 

tn>iq* ^- ¥ak ^ -^ S l ne e - f ^ 

second^ storing 

In contrast, storing a value representative of the contents of a packet uses memory in a more
allows the entire packet to be uniquely identified. A hash value, or hash digest, is an example of
is-coiBfmted-Mre s s- e a^ 

difest-niay-W IJsifig4h e - ' -d^^ 

j ^ sing-feengh-a-;^^^^^ 
i^femttrt i #n - -a^^ 

packet has not been observed, and that will respond positively (i.e. in a predictable way) when a
d e iwiag - r e pr e s e ata^^^ 

exemplary representations of packets having passed through a participating router.

■ jttfem t-S S 4 v B u t if S R 16 has a ■■ha6h--ata^ehiag--1^4;-it--fflay-i?&»c4-a-Te$fte^e-te--S$-l-- indie - atang-t - fcat : 
ths-ftadfcefr^atH*!^^ k^addid^ 

routers 1 hop away. In Fig. 2, SR16 sends QM1 to SR14, SR15 and SR17. Then, SR14, 15 and
17 determine if they have seen TP1 and notify SS1 accordingly. In this fashion, the query
message/reply process is forwarded to virtually all SRs within an AS on a hop-by-hop basis.
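The hop-by-hop query process described above can be sketched as a breadth-first traversal. This is only an illustrative sketch under stated assumptions: the function name `query_hop_by_hop`, the `has_seen` callback, and the small topology are hypothetical, and the traversal is simulated in a single process rather than by routers forwarding QM1 to one another.

```python
from collections import deque

def query_hop_by_hop(neighbors, first_hop, has_seen):
    """Query routers one hop at a time, starting from the server's first hop.

    neighbors: dict mapping a router name to the routers one hop beyond it.
    first_hop: routers the server queries directly (hop 1).
    has_seen:  callable(router) -> bool, the router's reply about the target.
    Returns a dict mapping router -> (hop count, reply).
    """
    replies = {}
    visited = set(first_hop)
    queue = deque((router, 1) for router in first_hop)
    while queue:
        router, hops = queue.popleft()
        replies[router] = (hops, has_seen(router))
        # The query is passed on to routers one hop further out.
        for nxt in neighbors.get(router, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, hops + 1))
    return replies

# Hypothetical topology loosely modeled on the Fig. 2 discussion.
topology = {"SR16": ["SR14", "SR15", "SR17"]}
observed = {"SR15", "SR16"}  # assumed: routers on the target packet's path
replies = query_hop_by_hop(topology, ["SR16"], lambda r: r in observed)
```

Positive replies mark routers the target packet may have traversed; the server could then assemble them into candidate paths.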

[0038] In Fig. 2, routers SR14, SR15 and SR17 are border routers for AS1, namely they

are the routers that contain routing tables for routers outside AS1. If routers external to AS1

47t - he^w r iftlH^ 

-When-theS R -ek* ^ 
by^fee4fm«le*Hw4fe^^ 
Protocol (IP) address ona-pa^ 

f*M>39| Still referring 4 -and th e foute taken by MP1 tithe routers making up P - Vl 

are-not^artieipato^ 
: £P T presej&*a ^*4n j H^ 




routers making up PN1 were participating as SRs, then R6 could be instructed to exclude TPs

[0040] The process used to perform source path isolation in Fig. 2 is referred to as an

que«e s 4R>m-^ 

figure 3 

|W44| F4g : - 44 l ^ 

denoted generally as B connected to external networks EN1-EN7, other routers within 300
connected to the border routers generally denoted as A, and a source path isolation server 
deaete d - as-SS - . AS - £00 - n>ay - ^ 

outward one hop at a time until the border routers, B, are reached. For Fig. 3, the routers labeled

Arage-qaefied on the 4 «sfclrep~a*Kl4l^^ 

hopv - "SfflC -e- t he4eeatie»^^ 

ai^>4>e-e*Bpleyed-— 

i H"twm - q ; u eiy the routef-s labeled A. As can be s e e^fren-r-Fig-.---3 v -an--^ 

jpyeggesa g^y^iese^ ffhe^fee i es e d- te ekR^^ 

containing virtually any number of participating routers. While inward-out and outward-in
t^4«y*}-ues4m' e -^ 

r^ttt^ » 4ee - at^l--^^ ^ 

teetekfues^n%eenffi^^ 

a%d-& e jabte? 

{0043-1 Puithef-detadeT^^ 

j ^d*4 s ek*fe>n^^ 

EXEMPLARY METHOD FOR SOURCE PATH ISOLATION SERVER
[0043] Fig. 4 illustrates an exemplary method for accomplishing source path isolation.

The method begins when SS1 receives TP1 from IDS1 operating within AS1 (step 402).
|00 4 4 j A4tern?eee f¥k»g44M- ^ ^ 

Examples of additional information that may be included in QM1 are, but are not limited to,
enoFyptien&ey^ 

away (step 406). SR may then process QM1 by hashing TP1 contained therein and comparing
the resulting value to hash values stored in local memory, where the stored hash values identify

[0045] After processing QM1, an SR may send a reply to SS1 (step 408). The response

may indicate that a queried router has seen TP1, or alternatively, that it has not (step 410). It is
i-fflfM>rtam--tt>-el»er-¥^^ 

not have a hash matching TP1, SR has definitively not seen TP1. However, if SR has a matching
hash, then SR has seen TP1 or a packet that has the same hash as TP1. When two different
paelvetev-h3*fflg--di-fe 

ffl046j lf-a-que«ed"SR"ha^ee tt - T4 ? 4 ^ 

fespeet»ve4>&*HM^^ A4 t emaft¥ eiy r i £^ 

TPl r tbe - ^ Repl ie s r e c ei v e d #om qtieried SRs 

are used to build a source path trace of possible paths taken by TP1 through the network using

known methods (step 416). SS1 may then attempt to identify the ingress point for TP1 (step

4 4%-4l : SS-f%-a*Hi^ 

paftieipamg-^ 

a-g-a-ia-( : s t e p-424)v 

fQ9 4 -?j £xaH»ple»«fc^ttfee-pa^ 

embedimente-di s ele s ed-her e in-ar e -btita 

se arch- - fa- - a-h^adtfa^ 

Im^e-obse^ ^ l-a-t^ 
the responses received by SS1. Where the nodes indicate locations that TP1 may have passed,

A«y -- gra : phs - ^ntajm^ 

p&tlw r i;ey-j>athMhat - ^ 

knMten-^ Sfts - % s tt^ 

p^in* s 4^ - ^^ 
r ^ pen ^ l- w ife ^ pesi^ 

ouH^&ep^^ 4feproe es ^ 

routers have been queried or all SRs in a round respond with a negative reply indicating that they
have not observed TP1. When a negative reply is received, it is associated as inactive path data.
j(M>48| WfeeflrSSI^MK^detegmitoe^^ 

IDS1 indicating that a solution has been found (step 420). Often it will be desirable to have the

mge es &fw fth -using ktH?w«-te6hj i>qtt e s - -(8t e p - 4S3->.- SSt-t«ay-al »o-af ch i v e p afe^«t k>ns;-ito. -- s e »^ 

data f e e et v e d; and th e l i k e ei th e r l eea l ly er f e mQt e ly: Furth e rmor e ; SSI may eernmnnieate 
information about source path isolation attempts to devices at remote locations coupled to a
network. For example, SS1 may communicate information to a network operations center

[0049] Here it is noted that as SS1 attempts to build a trace of the path taken by TP1,

tadiea^iea-to i ^9-4 e siF e ^P^ k t& k ee n - e^ 

e a n -be mitig a ted 

value decr e ases- Anoth e r mechanic 

value and setting a single bit for an observed packet, a plurality of hash values are computed for

ea£4frel^ e r^ e d-^a^k^ 

ftumb e i> -< >i"imjqu e 4^ 
hash table at a faster rate, the reduction in the number of hash collisions makes the tradeoff

EXEMPLARY DATA STRUCTURE FOR STORING TRACE INFORMATION 
ceafiHK4ief.v-w-ife- ^ 

data structure, it will be obvious to those skilled in the relevant arts that a plurality of data
structures may be employed and that the data structures may include additional parameters and

parameters, having data associated therewith. In the upper left portion of Fig. 5 are three

Ta^e^44Vfrt*me^ 

t naehk ^ y e adab leH*^^ 

or firewall. Time may be used to identify either the time at which TP was received at an SS, the
soweeH«ay45ei«sed-«^^ 

[0052] Within 500 are exemplary column headings indicating still other attributes that

identiffe a tien a tte^^^ 

routers, switches, bridges, or the like, within a network that have been queried by SS. Link may

be~n ^ l484d e m-ity-- ^ 

shf>wt* - a» ^ ^ 

may indicate the time, preferably using some common reference, at which a respective node




observed TP. Time is useful for assessing how long TP has been in the network and for 
p e f - fer - mmg-eei^^ 

itsed-fe-tmefe-vaf- i fta^^ ^ ^ been 
tr-ansferHmed: r tf-m^ Fer-eMample 

St a te s -may-o e -^ ^ d^^^ 

' .M i " may indicate thai, a link has b e en disab l ed to e xclude data traffic. 
Fig. 5 illustrates one exemplary embodiment of a data structure that may be used

of records may be readily employed without departing from the spirit of the invention. For

ex-af^kr the ^ er* ^ 

fr - aft s fe f B e d and states-may-be d^^^ 

8ag»-m*e4^tts-4~^^ 

p lural ity of r e cord s to^ Additionally; 
other column entries may be used in conjunction with, or in place of, those shown in Fig. 5. For
example, it may be desirable to associate the hash value, or alternatively, the contents of TP with
each-record It may ako^ e- deakafe ie -toh^ 

e ae - oum e r e d^r r aU e fflat^ AacM* 

aiay-b e -tksirable to have still other data structures or records associated with source path 
se-liatieas-that-hftv e b e en g e n e rat e d iu r e spons e to d e t e et e d TPs - ; 

One type of attack is a self-replicating network-transferred computer
program, such as a virus or worm, that is designed to annoy network users, deny network service
by overloading the network, or damage target computers (e.g., by deleting files). A virus is a
program that infects a computer or device by attaching itself to another program and propagating
itself when that program is executed, possibly destroying files or wiping out memory devices. A
worm, on the other hand, is a program that can make copies of itself and spread itself through
connected systems, using up resources in affected computers or causing other damage.

In recent years, viruses and worms have caused major network performance degradations and
wasted millions of man-hours in clean-up operations in corporations and homes all over the
world. Famous examples include the "Melissa" e-mail virus and the "Code Red" worm.

Various defenses, such as e-mail filters, anti-virus programs, and firewall mechanisms, have
been employed against viruses and worms, but with limited success. The defenses often rely on
computer-based recognition of known viruses and worms or block a specific instance of a
propagation mechanism (i.e., block e-mail transfers of Visual Basic Script (.vbs) attachments).
New viruses and worms have appeared, however, that evade existing defenses.

There is a need for new defenses to thwart the attack of known and yet-to-be-developed
viruses and worms. There is also a need to trace the path taken by a virus or worm.

SUMMARY OF THE INVENTION 

Systems and methods consistent with the present invention address these and other needs by
providing a new defense that attacks malicious packets, such as viruses and worms, at their most
common denominator (i.e., the need to transfer a copy of their code over a network to multiple
target systems, where the code is essentially the same in each copy, even though each
message containing the virus or worm may vary). The systems and methods also provide the
ability to trace the path of propagation back to the point of origin of the malicious packet (i.e.,
the place at which it was initially injected into the network).

In accordance with the principles of the invention as embodied and broadly described herein, a
system detects the transmission of potentially malicious packets. The system receives packets
and generates hash values corresponding to each of the packets. The system may then compare
the generated hash values to hash values corresponding to prior packets. The system may
determine that one of the packets is a potentially malicious packet when the generated hash value
corresponding to the one packet matches one of the hash values corresponding to one of the prior
packets and the one prior packet was received within a predetermined amount of time of the one
packet.



According to another implementation consistent with the present invention, a system for
hampering transmission of a potentially malicious packet is disclosed. The system includes
means for receiving a packet; means for generating one or more hash values from the packet;
means for comparing the generated one or more hash values to hash values corresponding to
prior packets; means for determining that the packet is a potentially malicious packet when the
generated one or more hash values match one or more of the hash values corresponding to at
least one of the prior packets and the at least one of the prior packets was received within a
predetermined amount of time of the packet; and means for hampering transmission of the packet
when the packet is determined to be a potentially malicious packet.
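The detect-and-hamper logic summarized above can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the class name `DuplicateContentDetector`, the `forward` helper, the use of SHA-256, and the 60-second default window are all hypothetical choices made for illustration.

```python
import hashlib

class DuplicateContentDetector:
    """Flags a packet whose hash matches a prior packet seen recently."""

    def __init__(self, window_s: float = 60.0):
        self.window_s = window_s   # assumed "predetermined amount of time"
        self.last_seen = {}        # hash digest -> arrival time of prior packet

    def observe(self, payload: bytes, now: float) -> bool:
        """Record the packet; return True if it is potentially malicious."""
        digest = hashlib.sha256(payload).digest()
        prior = self.last_seen.get(digest)
        self.last_seen[digest] = now
        # Suspicious only if the same hash was seen within the time window.
        return prior is not None and (now - prior) <= self.window_s

def forward(detector: DuplicateContentDetector, payload: bytes, now: float):
    """Return the payload to forward, or None to hamper (drop) it."""
    if detector.observe(payload, now):
        return None
    return payload
```

Dropping the packet is only one way to hamper transmission; rate limiting or raising an alert would fit the same structure.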

According to yet another implementation consistent with the present invention, a method for
detecting a path taken by a potentially malicious packet is disclosed. The method includes
storing hash values corresponding to received packets; receiving a message identifying a
potentially malicious packet; generating hash values from the potentially malicious packet;
comparing the generated hash values to the stored hash values; and determining that the
potentially malicious packet was one of the received packets when one or more of the generated
hash values match the stored hash values.
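The store-then-query method above can be illustrated with a compact bitmap of hash values. The sketch below is an assumption-laden illustration: `PacketDigestStore`, its parameters, and the choice to set several bits per packet and require all of them on lookup (a Bloom-filter-style design) are hypothetical and are not taken from this document's figures.

```python
import hashlib

class PacketDigestStore:
    """Bitmap of hash values for packets that passed through a device."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, packet: bytes):
        # Derive several bit positions by salting the hash input.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + packet).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def record(self, packet: bytes) -> None:
        """Store the hash values of a received packet."""
        for pos in self._positions(packet):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def may_have_seen(self, packet: bytes) -> bool:
        """False means definitely not seen; True means seen or a collision."""
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(packet))

def answer_query(store: PacketDigestStore, target_packet: bytes) -> bool:
    """Reply to a message identifying a potentially malicious packet."""
    return store.may_have_seen(target_packet)
```

A negative answer is definitive, while a positive answer can be a false positive caused by hash collisions, which is why several independent hash positions are used here.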

BRIEF DESCRIPTION OF THE DRAWINGS 

The accompanying drawings, which are incorporated in and constitute a part of this
specification, illustrate the invention and, together with the description, explain the invention. In
the drawings,

FIG. 1 is a diagram of an exemplary system in which systems and methods consistent with the present
invention may be implemented;

FIG. 2 is an exemplary diagram of a security server of FIG. 1 according to an implementation
consistent with the principles of the invention;

FIG. 3 is an exemplary diagram of packet detection logic according to an implementation
consistent with the principles of the invention;

FIGS. 4A and 4B illustrate two possible data structures stored within the hash memory of FIG. 3;

FIG. 5 is a flowchart of exemplary processing for detecting and/or preventing transmission of a
malicious packet, such as a virus or worm, according to an implementation consistent with the
principles of the invention;

FIG. 6 is a flowchart of exemplary processing for identifying the path taken through a network
by a malicious packet, such as a virus or worm, according to an implementation consistent with
the principles of the invention; and

FIG. 7 is a flowchart of exemplary processing for determining whether a malicious packet, such
as a virus or worm, has been observed according to an implementation consistent with the
principles of the invention.

DETAILED DESCRIPTION 



The following detailed description of the invention refers to the accompanying drawings. The
same reference numbers in different drawings may identify the same or similar elements. Also,
the following detailed description does not limit the invention. Instead, the scope of the invention
is defined by the appended claims and equivalents.

Systems and methods consistent with the present invention provide mechanisms to detect and/or
prevent the transmission of malicious packets and trace the propagation of the malicious packets
through a network. Malicious packets, as used herein, may include viruses, worms, and other
types of data with duplicated content, such as illegal mass e-mail (e.g., spam), that are repeatedly
transmitted through a network.

According to implementations consistent with the present invention, the content of a packet may
be hashed to trace the packet through a network. In other implementations, the header of a packet
may be hashed. In yet other implementations, some combination of the content and the header of
a packet may be hashed.
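The three hashing options above (content, header, or a combination) can be compared with a short sketch. The function `packet_hash`, its `mode` parameter, the sample byte strings, and the use of SHA-256 are assumptions for illustration only.

```python
import hashlib

def packet_hash(header: bytes, content: bytes, mode: str = "content") -> str:
    """Hash the content, the header, or both, depending on mode."""
    if mode == "content":
        data = content
    elif mode == "header":
        data = header
    elif mode == "both":
        data = header + content
    else:
        raise ValueError(f"unknown mode: {mode}")
    return hashlib.sha256(data).hexdigest()

# Two hypothetical packets carrying the same payload to different destinations.
first = (b"dst=10.0.0.1", b"same worm body")
second = (b"dst=10.0.0.2", b"same worm body")
same_content = packet_hash(*first) == packet_hash(*second)
same_header = packet_hash(*first, mode="header") == packet_hash(*second, mode="header")
```

Hashing only the content lets copies of one payload sent to different destinations produce the same value, while including the header distinguishes individual packets.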

EXEMPLARY SYSTEM FOR PERFORMING METHOD 

[0054] FIG. 6 illustrates a system 620 comprising a general-purpose computer that can be 

configured to practice disclosed embodiments. System 620 executes machine-readable code to 
perform the methods heretofore disclosed and includes a processor 602 CONFIGURATION

FIG. 1 is a diagram of an exemplary system 100 in which systems and methods consistent with
the present invention may be implemented. System 100 includes autonomous systems (ASs)
110-140 connected to public network (PN) 150. Connections made in system 100 may be via
wired, wireless, and/or optical communication paths. While FIG. 1 shows four autonomous
systems connected to a single public network, there can be more or fewer systems and networks
in other implementations consistent with the principles of the invention.

Public network 150 may include a collection of network devices, such as routers (R1-R5) or
switches, that transfer data between autonomous systems, such as autonomous systems 110-140.
In an implementation consistent with the present invention, public network 150 takes the form of
the Internet, an intranet, a public telephone network, a wide area network (WAN), or the like.

An autonomous system is a network domain in which all network devices (e.g., routers) in the
domain can exchange routing tables. Often, an autonomous system can take the form of a local
area network (LAN), a WAN, a metropolitan area network (MAN), etc. An autonomous system
may include computers or other types of communication devices (referred to as "hosts") that
connect to public network 150 via an intruder detection system (IDS), a firewall, one or more
border routers, or a combination of these devices.

Autonomous system 110, for example, includes hosts (H) 111-113 connected in a LAN
configuration. Hosts 111-113 connect to public network 150 via an intruder detection system
(IDS) 114. Intruder detection system 114 may include a commercially available device that uses
rule-based algorithms to determine if a given pattern of network traffic is abnormal. The general
premise used by an intruder detection system is that malicious network traffic will have a
different pattern from normal, or legitimate, network traffic.

Using a rule set, intruder detection system 114 monitors inbound traffic to autonomous system
110. When a suspicious pattern or event is detected, intruder detection system 114 may take
remedial action, or it can instruct a border router or firewall to modify operation to address the
malicious traffic pattern. For example, remedial actions may include disabling the link carrying
the malicious traffic, discarding packets coming from a particular source address, or discarding
packets addressed to a particular destination.

Autonomous system 120 contains different devices from autonomous system 110. These devices
aid autonomous system 120 in identifying and/or preventing the transmission of potentially
malicious packets within autonomous system 120 and tracing the propagation of the potentially
malicious packets through autonomous system 120 and, possibly, public network 150. While
FIG. 1 shows only autonomous system 120 as containing these devices, other autonomous
systems, including autonomous system 110, may include them.

Autonomous system 120 includes hosts (H) 121-123, intruder detection system 124, and security
server (SS) 125 connected to public network 150 via a collection of devices, such as security
routers (SR) 126-129. Hosts 121-123 may include computers or other types of
communication devices connected, for example, in a LAN configuration. Intruder detection
system 124 may be configured similar to intruder detection system 114.

Security server 125 may include a device, such as a general-purpose computer or a server, that
performs source path identification when a malicious packet is detected by intruder detection
system 124 or a security router 126-129. While security server 125 and intruder detection system
124 are shown as separate devices in FIG. 1, they can be combined into a single unit performing
both intrusion detection and source path identification in other implementations consistent with
the present invention.

FIG. 2 is an exemplary diagram of security server 125 according to an implementation consistent
with the principles of the invention. While one possible configuration of security server 125 is
illustrated in FIG. 2, other configurations are possible.

Security server 125 may include a processor 202, main memory [[604]]204, read only memory
(ROM) [[606]]206, storage device [[608]]208, bus [[610]]210, display [[612]]212, keyboard
[[614]]214, cursor control [[616]]216, and communication interface [[618.]]218. Processor
[[602]]202 may [[be]] include any type of conventional processing device that interprets and
executes instructions.

Main memory [[604]]204 may [[be]] include a random access memory (RAM) or a similar type
of dynamic storage device. Main memory [[604 stores]]204 may store information and instructions
to be executed by processor [[602.]]202. Main memory 204 may also be used for storing
temporary variables or other intermediate information during execution of instructions by
processor [[602.]]202. ROM [[606 stores]]206 may store static information and instructions for use
by processor [[602.]]202. It will be appreciated that ROM [[606]]206 may be replaced with some
other type of static storage device. Storage device [[608]]208, also referred to as a data storage
device, may include any type of magnetic or optical media and their corresponding interfaces
and operational hardware. Storage device [[608 stores]]208 may store information and instructions
for use by processor [[602.]]202.

Bus [[610 includes]]210 may include a set of hardware lines (conductors, optical fibers, or the like)
that allow for data transfer among the components of [[system 620.]]security server 125. Display
device [[612]]212 may be a cathode ray tube (CRT), liquid crystal display (LCD) or the like, for
displaying information in an operator or machine-readable form. Keyboard [[614]]214 and
cursor control [[616]]216 may allow the operator to interact with [[system 620.]]security server
125. Cursor control [[616]]216 may [[be]]include, for example, a mouse. In an alternative
configuration, keyboard [[614]]214 and cursor control [[616]]216 can be replaced with a
microphone and voice recognition [[means]]mechanisms to enable an operator or machine to
interact with [[system 620.]]security server 125.

Communication interface [[618]]218 enables [[system 620]]security server 125 to communicate
with other devices/systems via any communications medium. For example, communication
interface [[618]]218 may [[be]]include a modem, an Ethernet interface to a LAN, an interface to
the Internet, a printer interface, etc. Alternatively, communication interface [[618]]218 can
[[be]]include any other type of interface that enables communication between [[system
620]]security server 125 and other devices, systems, or networks. Communication interface
[[618]]218 can be used in lieu of keyboard [[614]]214 and cursor control [[616]]216 to facilitate
operator or machine remote control and communication with security server 125.



As will be described in deta 1 be > > tww-^ '^ e 25 ma) fHH4d tf-- ^$4"i>p e fating 

xvithki-AS I- with--te-abMrty te-- p erform source path isolation teen . i m d ot prevention 
mea y es foi a g?-*tm-4-P- SM - mai \ J ' f it . d mous syum 120. Security 
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I ! v, 12 J rmo -i- 



jMIM from IDS 1 and a 



- — H-j i : in response 



to processor j [60 1 202 executing sequences of instructions contained in, for example, memory 
[[604. 1 120 I Such instructions may be read into memory [[604]]204 from another computer- 
readable medium, such as storage i ce [[60S 20 t m another device coupled to bus 
[.[6iO]]2jO or coupled via communication interface tH8: - B^eeirtie« - ^ 

receiving a targ e t packet (st e p 402), r e c e iving r e plies from qu e ri e d rout e rs (step -108). and 
b ui lding a t r a ce of the path trav e led bv TP (st e p 1 1 6). 218. 



Alternatively, hard[[-]]wired circuitry may be used in place of or in combination with software 
instructions to implement, the functions of SS I- ^hu s r4h e- di s e : bs e d e mbodim e nt s - of SS4 a?e not 

example, the functionality may be implemented in an application specific integrated circuit 
(ASIC), a field-programmable gate array (FPGA), or the like, either alone or in combination with 
other devices to provide desired functionality . 



CONCLUSION 

fa-e-itkat e -souree path isolation of .malicious packets in a network. While the preceding disclosure 
is direct e d to an Int e rn e t Protocol (IP) network, disclos e d e mbodiments can be us e d in 



{AXM)% synchronerts opt i eal:-u e iw6i-k (SQ^E^ : > r an:d th e like- fa add i tion; disclos e d 
e mbodiments may be adapted to operate within different kyers-^f-a-Hetwefk-^fe-fts-fefr-^ata-liak- 



m$$\ Furt 



a th e dir . 



hods for impl e menting a 



ce path isolation 



n - gje-prr 



r-hardwa 



e xampfe r s < >ftwaf e 4e^^^ 
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programming language such as C C ■ . LISP, or the like. Alternatively, software may he 
i n3pl e m e n ie c44« - ^4gw^ 

where-f- e q»r&fti e Rts-stteli"ftS"Spe e d mu3t b e m e t Furthermore, SS may b e -eonf i gtifed-te 
eenimwiieate-w^^ 

te4m-v e --S S --m a^ 
»^:He - H f % - ne^ 

by e mploying .multip l e processors or by having various compon e nts physically separat e d and 

uaiftg4ie"T^v^rk - €af^ - iag - date - tmffH - " ft mong the SRs. For example, using a dedicated network 
ftiay-previd e... ^ 

tkat-ene-e? mor e iink s- tfr - aft - SR- ifr- tUg a & te dv 

■| 00l > 7| Q t te ry mea sage<H*^Ms)-a4*d-re^ 

p»u kft type ln-*«any4n s ta»e es ^ 

readily known protocols; however, customized protocols and messag e types can be used. For 
exampl e , it may be desirabl e to employ a smart packet for sending QMs to participating rout e rs 
A-^nartpa^ e t^on e -tto^ 
ak»g - M4fe - maefeft e H^adfi& e 4nsm^ ^ ^ 

madi-fy-its operation in response to the contents of the executable instnietions contained therein. 
Smart pack e ts facilitat e rapid respons e s to n e twork intrusiono by allowing an SR to modi f y 
ep e r a tk^ s m * n^ 

{0058} Fnrthe^ofe r the-4isele^ 

wtnrifckb e-ia *eet^ ^ ^ ^ 

aneth e fi or- a pack e t was spht fe As can be seen- maay 

invention. 

44m^feM%4fa e -y^ent" e ^ 
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foregoing description, and all changes within the moaning and range of equivalency of the claims 



Security routers 126-129 may include network devices, such as routers, that may detect and/or
prevent the transmission of malicious packets and perform source path identification functions.
Security routers 127-129 may include border routers for autonomous system 120 because these
routers include connections to public network 150. As a result, security routers 127-129 may
include routing tables for routers outside autonomous system 120.

FIG. 3 is an exemplary diagram of packet detection logic 300 according to an implementation
consistent with the principles of the invention. Packet detection logic 300 may be implemented
within a device that taps one or more bidirectional links of a router, such as security routers 126-
129. In another implementation, packet detection logic 300 may be implemented within a router,
such as security routers 126-129. In the discussion that follows, it may be assumed that packet
detection logic 300 is implemented within a security router.

Packet detection logic 300 may include hash processor 310 and hash memory 320. Hash
processor 310 may include a conventional processor, an ASIC, an FPGA, or a combination of
these that generates one or more representations for each received packet and records the packet
representations in hash memory 320.

A packet representation will likely not be a copy of the entire packet, but rather it will include a
portion of the packet or some unique value representative of the packet. Because modern routers
can pass gigabits of data per second, storing complete packets is not practical because memories
would have to be prohibitively large. By contrast, storing a value representative of the contents
of a packet uses memory in a much more efficient manner. By way of example, if incoming
packets range in size from 256 bits to 1000 bits, a fixed width number may be computed across
fixed-sized blocks making up the content (or payload) of a packet in a manner that allows the
entire packet to be identified. To further illustrate the use of representations, a 32-bit hash value,
or digest, may be computed across fixed-sized blocks of each packet. Then, the hash value may
be stored in hash memory 320 or may be used as an index, or address, into hash memory 320.
Using the hash value, or an index derived therefrom, results in efficient use of hash memory 320
while still allowing the content of each packet passing through packet detection logic 300 to be
identified.

Systems and methods consistent with the present invention may use any storage scheme that
records information about each packet in a space-efficient fashion, that can definitively
determine if a packet has not been observed, and that can respond positively (i.e., in a predictable
way) when a packet has been observed. Although systems and methods consistent with the
present invention can use virtually any technique for deriving representations of packets, for
brevity, the remaining discussion will use hash values as exemplary representations of packets
having passed through a participating router.

Hash processor 310 may determine a hash value over successive, fixed-sized blocks in the
payload field (i.e., the contents) of an observed packet. For example, hash processor 310 may
hash each successive 64-byte block following the header field. As described in more detail
below, hash processor 310 may use the hash results of the hash operation to recognize duplicate
occurrences of packet content and raise a warning if it detects packets with replicated content
within a short period of time. Hash processor 310 may also use the hash results for tracing the
path of a malicious packet through the network.

The hash value may be determined by taking an input block of data, such as a 64-byte block of a
packet, and processing it to obtain a numerical value that represents the given input data.
Suitable hash functions are readily known in the art and will not be discussed in detail herein.
Examples of hash functions include the Cyclic Redundancy Check (CRC) and Message Digest 5
(MD5).

The resulting hash value, also referred to as a message digest or hash digest, is a fixed length
value. The hash value serves as a signature for the data over which it was computed. For
example, incoming packets could have fixed hash value(s) computed over their content.

The hash value essentially acts as a fingerprint identifying the input block of data over which it
was computed. Unlike fingerprints, however, there is a chance that two very different pieces of
data will hash to the same value, resulting in a hash collision. An acceptable hash function
should provide a good distribution of values over a variety of data inputs in order to prevent
these collisions. Because collisions occur when different input blocks result in the same hash
value, an ambiguity may arise when attempting to associate a result with a particular input.

Hash processor 310 may store a representation of each packet it observes in hash memory 320.
Hash processor 310 may store the actual hash values as the packet representations or it may use
other techniques for minimizing storage requirements associated with retaining hash values and
other information associated therewith. A technique for minimizing storage requirements may
use a bit array or Bloom filters for storing hash values.

Rather than storing the actual hash value, which can typically be on the order of 32 bits or more
in length, hash processor 310 may use the hash value as an index for addressing a bit array
within hash memory 320. In other words, when hash processor 310 generates a hash value for a
fixed-sized block of a packet, the hash value serves as the address location into the bit array. At
the address corresponding to the hash value, one or more bits may be set at the respective
location thus indicating that a particular hash value, and hence a particular data packet content,
has been seen by hash processor 310. For example, using a 32-bit hash value provides on the
order of 4.3 billion possible index values into the bit array. Storing one bit per fixed-sized block
rather than storing the block itself, which can be 512 bits long, produces a compression factor of
1:512. While bit arrays are described by way of example, it will be obvious to those skilled in the
relevant art that other storage techniques may be employed without departing from the spirit of
the invention.
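
By way of illustration only, the bit-array scheme described above might be sketched as follows. The sketch assumes Python, uses the standard library CRC-32 as the hash function, and masks the 32-bit hash down to a 20-bit index so that the array stays small; the names BitArray, block_hashes, BLOCK_SIZE, and INDEX_BITS are hypothetical and do not appear in the specification.

```python
import zlib

BLOCK_SIZE = 64   # bytes per hashed block (the 64-byte example given in the text)
INDEX_BITS = 20   # assumption: a 2**20-bit array keeps this sketch small


class BitArray:
    """One bit per possible index value; a set bit means 'this hash was seen'."""

    def __init__(self, index_bits=INDEX_BITS):
        self.mask = (1 << index_bits) - 1
        self.bits = bytearray((1 << index_bits) // 8)

    def _locate(self, hash_value):
        index = hash_value & self.mask        # hash value used as the address
        return index >> 3, 1 << (index & 7)   # byte position, bit within byte

    def set(self, hash_value):
        byte, bit = self._locate(hash_value)
        self.bits[byte] |= bit

    def test(self, hash_value):
        byte, bit = self._locate(hash_value)
        return bool(self.bits[byte] & bit)


def block_hashes(payload, block_size=BLOCK_SIZE):
    """Return a 32-bit CRC for each successive full fixed-sized block."""
    full = len(payload) - len(payload) % block_size
    return [zlib.crc32(payload[i:i + block_size])
            for i in range(0, full, block_size)]
```

A cleared bit is a definitive "not observed", while a set bit means the block, or another block that maps to the same index, was observed.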






In re U.S. 10/654,7?1: Changes made to 09/881,074 to create CIP app 10/251,403

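Over time, hash memory 320 may fill up and the possibility of overflowing an existing index
value increases. The risk of overflowing an index value may be reduced if the bit array is
periodically flushed to other storage media, such as a magnetic disk drive, optical media, solid
state drive, or the like. Alternatively, the bit array may be slowly and incrementally erased. To
facilitate this, a time-table may be established for flushing/erasing the bit array. If desired, the
flushing/erasing cycle can be reduced by computing hash values only for a subset of the packets
passing through the router. While this approach reduces the flushing/erasing cycle, it increases
the possibility that a target packet may be missed (i.e., a hash value is not computed over a
portion of it).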

FIGS. 4A and 4B are exemplary diagrams of entries that may be stored within hash memory
320 in implementations consistent with the principles of the invention. As shown in FIG. 4A,
hash memory 320 may include indicator fields 412 and counter fields 414 addressable by
corresponding hash addresses. The hash addresses may correspond to possible hash values
generated by hash processor 310.

Indicator field 412 may store one or more bits that indicate whether a packet block with the
corresponding hash value has been observed by hash processor 310. Counter field 414 may
record the number of occurrences of packet blocks with the corresponding hash value. Counter
field 414 may periodically decrement its count for flushing purposes.
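
As a minimal sketch of how such entries could behave, assuming Python, the following models an indicator flag and an occurrence counter at each hash address. The class name HashMemory, the table size, and the choice to clear the indicator when a counter decays to zero are assumptions made for illustration and are not stated in the specification.

```python
class HashMemory:
    """Indicator and counter fields addressable by hash address (illustrative)."""

    def __init__(self, size=1 << 16):
        self.size = size
        self.indicator = bytearray(size)   # stands in for indicator field 412
        self.counter = [0] * size          # stands in for counter field 414

    def _addr(self, hash_value):
        return hash_value % self.size      # hash value reduced to a table address

    def record(self, hash_value):
        """Mark the block as observed and return its updated occurrence count."""
        addr = self._addr(hash_value)
        self.indicator[addr] = 1
        self.counter[addr] += 1
        return self.counter[addr]

    def seen(self, hash_value):
        return bool(self.indicator[self._addr(hash_value)])

    def age(self):
        """Periodic decrement for flushing purposes; an entry whose count
        reaches zero is treated as no longer observed (an assumption)."""
        for addr in range(self.size):
            if self.counter[addr] > 0:
                self.counter[addr] -= 1
                if self.counter[addr] == 0:
                    self.indicator[addr] = 0
```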

As shown in FIG. 4B, hash memory 320 may store additional information relating to a packet.
For example, hash memory 320 may include link identification (ID) fields 422 and status fields
424. Link ID field 422 may store information regarding the particular link upon which the packet
arrived at packet detection logic 300. Status field 424 may store information to aid in monitoring
the status of packet detection logic 300 or the link identified by link ID field 422.

In an alternate implementation consistent with the principles of the invention, hash memory 320
may be preprogrammed to store hash values corresponding to known malicious packets, such as
known viruses and worms. Hash memory 320 may store these hash values separately from the
hash values of observed packets. In this case, hash processor 310 may compare a hash value for a
received packet to not only the hash values of previously observed packets, but also to hash
values of known malicious packets.

In yet another implementation consistent with the principles of the invention, hash memory 320
may be preprogrammed to store source addresses of known sources of legitimate duplicated
content, such as packets from a multicast server, a popular page on a web server, an output from
a mailing list "exploder" server, or the like. In this case, hash processor 310 may compare the
source address for a received packet to the source addresses of known sources of legitimate
duplicated content.

EXEMPLARY PROCESSING FOR MALICIOUS PACKET DETECTION 

FIG. 5 is a flowchart of exemplary processing for detecting and/or preventing transmission of a
malicious packet, such as a virus or worm, according to an implementation consistent with the
principles of the invention. The processing of FIG. 5 may be performed by packet detection logic
300 within a tap device, a security router, such as security router 126, or other devices
configured to detect and/or prevent transmission of malicious packets. In other implementations,
one or more of the described acts may be performed by other systems or devices within system
100.

Processing may begin when packet detection logic 300 receives, or otherwise observes, a packet
(act 505). Hash processor 310 may generate one or more hash values by hashing successive,
fixed-sized blocks from the packet's payload field (act 510). Hash processor 310 may use a
conventional technique to perform the hashing operation.

Hash processor 310 may optionally compare the generated hash value(s) to hash values of
known viruses and/or worms within hash memory 320 (act 515). In this case, hash memory 320
may be preprogrammed to store hash values corresponding to known viruses and/or worms. If
one or more of the generated hash values match one of the hash values of known viruses and/or
worms, hash processor 310 may take remedial actions (acts 520 and 525). The remedial actions
may include raising a warning for a human operator, delaying transmission of the packet,
requiring human examination before transmission of the packet, dropping the packet and
possibly other packets originating from the same Internet Protocol (IP) address as the packet,
sending a Transmission Control Protocol (TCP) close message to the sender thereby preventing
complete transmission of the packet, disconnecting the link on which the packet was received,
and/or corrupting the packet content in a way likely to render any code contained therein inert
(and likely to cause the receiver to drop the packet).

If the generated hash value(s) do not match any of the hash values of known viruses and/or
worms, or if such a comparison is not performed, hash processor 310 may optionally
determine whether the packet's source address indicates that the packet was sent from a
legitimate source of duplicated packet content (i.e., a legitimate "replicator") (act 530). For
example, hash processor 310 may maintain a list of legitimate replicators in hash memory 320
and check the source address of the packet with the addresses of legitimate replicators on the list.

If the packet's source address matches the address of one of the legitimate replicators, then hash
processor 310 may end processing of the packet. For example, processing may return to act 505
and await receipt of the next packet.

Otherwise, hash processor 310 may determine whether any prior packets with the same hash
value(s) have been received (act 535). For example, hash processor 310 may use each of the
generated hash value(s) as an address into hash memory 320. Hash processor 310 may then
examine indicator field 412 (FIG. 4) at each address to determine whether the one or more bits
stored therein indicate that a prior packet has been received.

If there were no prior packets received with the same hash value(s), then hash processor 310 may
record the generated hash value(s) in hash memory 320 (act 540). For example, hash processor
310 may set the one or more bits stored in indicator field 412, corresponding to each of the
generated hash values, to indicate that the corresponding packet was observed by hash processor
310. Processing may then return to act 505 to await receipt of the next packet.

If hash processor 310 determines that a prior packet has been observed with the same hash value,
hash processor 310 may determine whether the packet is potentially malicious (act 545). Hash
processor 310 may use a set of rules to determine whether to identify a packet as potentially
malicious. For example, the rules might specify that more than x (where x > 1) packets
with the same hash value have to be observed by hash processor 310 before the packets are
identified as potentially malicious. The rules might also specify that these packets have to have
been observed by hash processor 310 within a specified period of time of one another. The
reason for the latter rule is that, in the case of malicious packets, such as viruses and worms,
multiple packets will likely pass through packet detection logic 300 within a short period of time.
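
A rule of the kind described above (more than x sightings of the same hash value within a specified period) could be sketched as follows, assuming Python. The class name ReplicationRule, the default threshold and window, and the use of caller-supplied timestamps are hypothetical choices made only for illustration.

```python
from collections import defaultdict, deque


class ReplicationRule:
    """Flag a hash value once more than x sightings fall inside one time window."""

    def __init__(self, x=3, window=10.0):
        self.x = x                    # threshold from the rule: more than x packets
        self.window = window          # specified period of time, in seconds
        self.sightings = defaultdict(deque)

    def observe(self, hash_value, now):
        """Record a sighting at time `now`; return True if potentially malicious."""
        times = self.sightings[hash_value]
        times.append(now)
        # Forget sightings that are older than the window.
        while times and now - times[0] > self.window:
            times.popleft()
        return len(times) > self.x
```

Passing the timestamp in explicitly keeps the sketch deterministic; a deployed device would presumably read a clock instead.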

A packet may contain multiple hash blocks that partially match hash blocks associated with prior
packets. For example, a packet that includes multiple hash blocks may have somewhere between
one and all of its hashed content blocks match hash blocks associated with prior packets. The
rules might specify the number of blocks and/or the number and/or length of sequences of blocks
that need to match before hash processor 310 identifies the packet as potentially malicious.

When hash processor 310 determines that the packet is not malicious (e.g., not a worm or virus),
such as when less than x number of packets with the same hash value or less than a
predetermined number of the packet blocks with the same hash values are observed or when the
packets are observed outside the specified period of time, hash processor 310 may record the
generated hash value(s) in hash memory 320 (act 540). For example, hash processor 310 may set
the one or more bits stored in indicator field 412, corresponding to each of the generated hash
values, to indicate that the corresponding packet was observed by hash processor 310.
Processing may then return to act 505 to await receipt of the next packet.

When hash processor 310 determines that the packet may be malicious, then hash processor 310
may take remedial actions (act 550). In some cases, it may not be possible to determine whether
the packet is actually malicious because there is some probability that there was a false match or
a legitimate replication. As a result, hash processor 310 may determine the probability of the
packet actually being malicious based on information gathered by hash processor 310.

The remedial actions may include raising a warning for a human operator, saving the packet for
human analysis, dropping the packet, corrupting the packet content in a way likely to render any
code contained therein inert (and likely to cause the receiver to drop the packet), delaying
transmission of the packet, requiring human examination before transmission of the packet,
dropping other packets originating from the same IP address as the packet, sending a TCP close
message to the sender thereby preventing complete transmission of the packet, and/or
disconnecting the link on which the packet was received. Some of the remedial actions, such as
dropping or corrupting the packet, may be performed probabilistically based, for example, on the
determined probability of the packet actually being malicious. This may greatly slow the spread
rate of a virus or worm without completely stopping legitimate traffic that happened to match a
suspect profile.

EXEMPLARY PROCESSING FOR SOURCE PATH IDENTIFICATION

FIG. 6 is a flowchart of exemplary processing for identifying the path taken through a network
by a malicious packet, such as a virus or worm, according to an implementation consistent with
the principles of the invention. The processing of FIG. 6 may be performed by a security server,
such as security server 125, or other devices configured to trace the paths taken by malicious
packets. In other implementations, one or more of the described acts may be performed by other
systems or devices within system 100.

Processing may begin with intruder detection system 124 detecting a malicious packet. Intruder
detection system 124 may use conventional techniques to detect the malicious packet. For
example, intruder detection system 124 may use rule-based algorithms to identify a packet as
part of an abnormal network traffic pattern. When a malicious packet is detected, intruder
detection system 124 may notify security server 125 that a malicious packet has been detected
within autonomous system 120. The notification may include the malicious packet or portions
thereof along with other information useful for security server 125 to begin source path
identification. Examples of information that intruder detection system 124 may send to security
server 125 along with the malicious packet include time-of-arrival information, encapsulation
information, link information, and the like.

After receiving the malicious packet, security server 125 may generate a query that includes the
malicious packet and any additional information desirable for facilitating communication with
participating routers, such as security routers 126-129 (acts 605 and 610). Examples of
additional information that may be included in the query are, but are not limited to, destination
addresses for participating routers, passwords required for querying a router, encryption keying
information, time-to-live (TTL) fields, and the like.

Security server 125 may then send the query to security router(s) located one hop away (act
615). The security router(s) may analyze the query to determine whether they have seen the
malicious packet. To make this determination, the security router(s) may use processing similar
to that described below with regard to FIG. 7.

After processing the query, the security router(s) may send a response to security server 125. The
response may indicate that the security router has seen the malicious packet, or alternatively, that
it has not. It is important to observe that the two answers are not equal in their degree of
certainty. If a security router does not have a hash matching the malicious packet, the security
router has definitively not seen the malicious packet. If the security router has a matching hash,
however, then the security router has seen the malicious packet or a packet that has the same
hash value as the malicious packet. When two different packets, having different contents, hash
to the same value it is referred to as a hash collision.

The security router(s) may also forward the query to other routers or devices to which they are
connected. For example, the security router(s) may forward the query to the security router(s)
that are located two hops away from security server 125, which may forward the query to security
router(s) located three hops away, and so on. This forwarding may continue to include routers or
devices within public network 150 if these routers or devices have been configured to participate
in the tracing of the paths taken by malicious packets. This approach may be called an
inward-out approach because the query travels a path that extends outward from security server
125. Alternatively, an outward-in approach may be used.

Security server 125 receives the responses from the security routers indicating whether the
security routers have seen the malicious packet (acts 620 and 625). If a response indicates that
the security router has seen the malicious packet, security server 125 associates the response and
identification (ID) information for the respective security router with active path data (act 630).
Alternatively, if the response indicates that the security router has not seen the malicious packet,
security server 125 associates the response and identification information for the security router
with inactive path data (act 635).

Security server 125 uses the active and inactive path data to build a trace of the potential paths
taken by the malicious packet as it traveled, or propagated, across the network (act 640). Security
server 125 may continue to build the trace until it receives all the responses from the security
routers (acts 640 and 645). Security server 125 may attempt to build a trace with each received
response to determine the ingress point for the malicious packet. The ingress point may identify
where the malicious packet entered autonomous system 120, public network 150, or another
autonomous system.
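
One way the active and inactive path data might be combined into a trace is sketched below, assuming Python. The function name build_trace, the adjacency-list topology, and the heuristic of treating the active routers farthest (in hops) from the security server as candidate ingress points are assumptions made for illustration; the specification does not prescribe a particular trace-building algorithm.

```python
from collections import deque


def build_trace(links, first_hop, responses):
    """Walk outward from the security server through routers that answered 'seen'.

    links:      dict mapping a router name to the routers it connects to
    first_hop:  routers located one hop from the security server
    responses:  dict mapping a router name to True (seen) or False (not seen)
    """
    active = {r for r, seen in responses.items() if seen}        # active path data
    inactive = {r for r, seen in responses.items() if not seen}  # inactive path data

    hops = {}
    queue = deque((r, 1) for r in first_hop if r in active)
    while queue:
        router, dist = queue.popleft()
        if router in hops:
            continue                      # already reached by a shorter path
        hops[router] = dist
        for nxt in links.get(router, []):
            if nxt in active and nxt not in hops:
                queue.append((nxt, dist + 1))

    farthest = max(hops.values(), default=0)
    candidates = sorted(r for r, d in hops.items() if d == farthest)
    return hops, inactive, candidates
```

Because a positive response may stem from a hash collision, more than one candidate may be returned.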

As security server 125 attempts to build a trace of the path taken by the malicious packet, several
paths may emerge as a result of hash collisions occurring in the participating routers. When hash
collisions occur, they act as false positives in the sense that security server 125 interprets the
collision as an indication that the malicious packet has been observed. Fortunately, the
occurrences of hash collisions can be mitigated. One mechanism for reducing hash collisions is
to compute large hash values over the packets since the chances of collisions rise as the number
of bits comprising the hash value decreases. Another mechanism to reduce false positives
resulting from collisions is for each security router (e.g., security routers 126-129) to implement
its own unique hash function. In this case, the same collision will not occur in other security
routers.

A further mechanism for reducing collisions is to control the density of the hash tables in the
memories of participating routers. That is, rather than computing a single hash value and setting
a single bit for an observed packet, a plurality of hash values may be computed for each
observed packet using several unique hash functions. This produces a corresponding number of
unique hash values for each observed packet. While this approach fills the hash table at a faster
rate, the reduction in the number of hash collisions makes the tradeoff worthwhile in many
instances. For example, Bloom filters may be used to compute multiple hash values over a given
packet in order to reduce the number of collisions and, hence, enhance the accuracy of traced
paths.
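
The multiple-hash approach can be illustrated with a small Bloom filter, sketched here in Python. The class name BloomFilter, the table size, the number of hash functions, and the derivation of the individual hash functions by prefixing a one-byte index to a SHA-256 input are assumptions for illustration only.

```python
import hashlib


class BloomFilter:
    """Set several bits per observed block, one per hash function."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _indexes(self, block):
        # Assumption: distinct hash functions are simulated by prefixing the
        # function number to the block before hashing.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + block).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, block):
        for idx in self._indexes(block):
            self.bits[idx >> 3] |= 1 << (idx & 7)

    def might_contain(self, block):
        """False means definitely not observed; True means probably observed."""
        return all(self.bits[idx >> 3] & (1 << (idx & 7))
                   for idx in self._indexes(block))
```

A false positive now requires every one of the selected bits to have been set by other blocks, which is why the density of the table matters.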

When security server 125 has determined an ingress point for the malicious packet, it may notify
intruder detection system 124 that the ingress point for the malicious packet has been determined
(act 650). Security server 125 may also take remedial actions (act 655). Often it will be desirable
to have the participating router closest to the ingress point close off the ingress path used by the
malicious packet. As such, security server 125 may send a message to the respective
participating router instructing it to close off the ingress path using known techniques.

Security server 125 may also archive copies of solutions generated, data sent, data received, and
the like either locally or remotely. Furthermore, security server 125 may communicate
information about source path identification attempts to devices at remote locations coupled to a
network. For example, security server 125 may communicate information to a network
operations center, a redundant security server, or to a data analysis facility for post processing.

EXEMPLARY PROCESSING FOR DETERMINING WHETHER A MALICIOUS PACKET
HAS BEEN OBSERVED

FIG. 7 is a flowchart of exemplary processing for determining whether a malicious packet, such
as a virus or worm, has been observed according to an implementation consistent with the
principles of the invention. The processing of FIG. 7 may be performed by packet detection logic
300 implemented within a security router, such as security router 126, or by other devices
configured to trace the paths taken by malicious packets. In other implementations, one or more
of the described acts may be performed by other systems or devices within system 100.

Processing may begin when security router 126 receives a query from security server 125 (act
705). As described above, the query may include a TTL field. A TTL field may be employed
because it provides an efficient mechanism for ensuring that a security router responds only to
relevant, or timely, queries. In addition, employing TTL fields may reduce the amount of data
traversing the network between security server 125 and participating routers because queries
with expired TTL fields may be discarded.

If the query includes a TTL field, security router 126 may determine if the TTL field in the query
has expired (act 710). If the TTL field has expired, security router 126 may discard the query
(act 715). If the TTL field has not expired, security router 126 may hash the malicious packet
contained within the query at each possible starting offset within a block (act 720). Security
router 126 may generate multiple hash values because the code body of a virus or worm may
appear at any arbitrary offset within the packet that carries it (e.g., each copy may have an e-mail
header attached that differs in length for each copy).

Security router 126 may then determine whether any of the generated hash values match one of
the recorded hash values in hash memory 320 (act 725). Security router 126 may use each of the
generated hash values as an address into hash memory 320. At each of the addresses, security
router 126 may determine whether indicator field 412 indicates that a prior packet with the same
hash value has been observed. If none of the generated hash values match a hash value in hash
memory 320, security router 126 does not forward the query (act 730), but instead may send a
negative response to security server 125 (act 735).

1 mc u___l lL^ n » * i- hash \ ■ m hji x. ^ _» 

security, rom^^ 

the direction from which the query was received (act 740). Security roim 2 • - may also send a 
positive respo ns e to security server 125, indicating that the packet >, been - i < m ~4M 
The response may include the address of security router 126 and i nformation about observed 
packets that have passed through security router 1.26. 
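The decision flow of acts 705 through 740 can be summarized in a short sketch. The function name `handle_query`, the string return values, and the convention that a TTL of zero or less means "expired" are assumptions for illustration only; the text does not specify them.

```python
def handle_query(ttl, packet_hashes, hash_memory, in_port, ports):
    """Return (response, ports_to_forward_on) for one trace query.

    ttl may be None when the query carries no TTL field.
    hash_memory is any container supporting the `in` test.
    """
    if ttl is not None and ttl <= 0:
        return "discard", []       # expired query is dropped
    if not any(h in hash_memory for h in packet_hashes):
        return "negative", []      # packet never observed: do not forward
    # Packet observed: forward on every port except the arrival port.
    return "positive", [p for p in ports if p != in_port]
```

For example, `handle_query(3, [17, 42], {42}, "p1", ["p1", "p2", "p3"])` returns a positive response and the two ports other than `"p1"`.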

CONCLUSION 

Systems and methods consistent with the present invention detect and/or
prevent transmission of malicious packets, such as viruses and worms, and trace the propagation
of the malicious packets through a network.

The foregoing description of preferred embodiments of the present invention provides 
illustration and description, but is not intended to be exhaustive or to limit the invention to the 
precise form disclosed. Modifications and variations are possible in light of the above teachings 
or may be acquired from practice of the invention.

For example, systems and methods have been described with regard to network-level devices. In
other implementations, the systems and methods described herein may be used with a stand-
alone device at the input or output of a network link or at other protocol levels, such as in mail
relay hosts (e.g., Simple Mail Transfer Protocol (SMTP) servers).

Also, while series of acts have been described with regard to the flowcharts of FIGS. 5-7, the order of
the acts may differ in other implementations consistent with the principles of the invention. In
addition, non-dependent acts may be performed concurrently.

Further, certain portions of the invention have been described as "logic" that performs one or
more functions. This logic may include hardware, such as an application specific integrated
circuit or a field programmable gate array, software, or a combination of hardware and software.

No element, act, or instruction used in the description of the present application should be
construed as critical or essential to the invention unless explicitly described as such. Also, as
used herein, the article "a" is intended to include one or more items. Where only one item is
intended, the term "one" or similar language is used. The scope of the invention is defined by the
claims and their equivalents.

HASH-BASED SYSTEMS AND METHODS FOR DETECTING
AND PREVENTING TRANSMISSION OF UNWANTED E-MAIL

BACKGROUND OF THE INVENTION 
Field of the Invention 

The present invention relates generally to network security and, more particularly, to systems
and methods for detecting and/or preventing the transmission of unwanted e-mails,
such as e-mails containing worms and viruses, including polymorphic worms and viruses.

Description of Related Art

Availability of low cost computers, high speed networking products, and readily available
network connections has helped fuel the proliferation of the Internet. This proliferation has
caused the Internet to become an essential tool for both the business community and private
individuals. Dependence on the Internet arises, in part, because the Internet makes it possible for
multitudes of users to access vast amounts of information and perform remote transactions
expeditiously and efficiently. Along with the rapid growth of the Internet have come problems
arising from attacks from within the
network and the sheer volume of commercial e-mail. As the size of the Internet continues to
grow, so does the threat posed to users of the Internet.

[0001] Many of the problems take the form of e-mail. Viruses and worms often masquerade
within e-mail messages for execution by unsuspecting e-mail recipients. Unsolicited commercial
e-mail, or "spam," is another burdensome type of e-mail because it wastes both the time and
resources of the e-mail recipient.

[0002] Existing techniques for detecting viruses, worms, and spam examine each e-mail
message individually. In the case of viruses and worms, this typically means examining
attachments for byte-strings found in known viruses and worms (possibly after uncompressing or
de-archiving attached files), or simulating execution of the attachment in a "safe" compartment
and examining its behavior. Similarly, existing spam filters usually examine a single e-mail
message looking for heuristic traits commonly found in unsolicited commercial e-mail, such as
an abundance of Uniform Resource Locators (URLs), heavy use of all-capital-letter words, use
of colored text or large fonts, and the like, and then "score" the message based on the number
and types of such traits found. Both the anti-virus and the anti-spam techniques can demand
significant processing of each message, adding to the resource burden imposed by unwanted
e-mail. Neither technique makes use of information collected from other recent messages.

[0003] Thus, there is need for an efficient technique that can quickly detect viruses, worms,
and spam in e-mail messages arriving at e-mail servers, possibly by using information contained
in multiple recent messages to detect unwanted mail more quickly and efficiently.

[0004] thea e ' individuals. 

The ever-increasing number of computers, routers, and connections making up the Internet 

he^ts or eompnt e ^s; conn e ct e d to ih e n e werk: lu-faet; ■■■■e ach ■ rout e r; ■ witch- or eemput e r- 
eeen e et e d to the Internet may be a potential entry point from wh i ch a ma l icious mdivkhial-can 
launch an attack whil e remaining larg el y und e t e ct e d. Attacks carri e d out on th e internet often 

sw i tefe- ean be compromised and configured to place malicious packets onto the network. 
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pfegm^^b-'a»'a"¥tTO8"eg'weft» r that i s d e sign e d to aBHey4\etwork-^sei ; Sr^y"«etworig"8ef¥i€e 
t>v -tw e rfeadmg the network; or damage target computers <e.g.^y<ieleti^-^jes)-."A"¥miS--ifra 
pregram4to4ftfe €t S"a"e^ ^ ^ 

worm, oh the oth e r hand, is a program that can mak e copies of its e lf and spread itself through 
eeHHe€4 e 4-sys^ e BBy t^i«g"iif> t : esoiirces in effected computers or causing other damage. 

fe - f e e e m -ye ar -s r w 

wast e d- mi l hens of man-hours in ■ clean-^P ' Op e rattoHs^n ' COfporation S' and^^Offl e s-ali-over-tbe 
world. Famous examples include the "Melissa" e-mail virus and the "Code Red" worm 

Various defenses, such as e-mail fitters; anti-vkus programs, and firewall mechanisms; have 
b ee n e mploy e d against virus es and worm s , but with limit e d s ucce s s. Th e d e fens e s oft e n r e ly on 
eemputer-43ased-re« ogn 
propagat i on m e chanism!^ 

develop e d viruses and worms. Th e re is also a n e ed to trac e the path tak e n by a virus or worm. 
SUMMARY OF THE INVENTION

Systems and methods consistent with the present invention address this and other needs
by providing a new defense that detects and prevents the transmission of unwanted (and
potentially unwanted) e-mail, such as e-mails containing viruses, worms, and spam.
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{OOOSf ttttae-ks-majioiei t s-p aek- e tfr-sacfa as viruses and worms, at their fiao s t-eommon 

denemlnaterti^ 

s ys*et3*s r wi* e t^^ 

th e plac e at which it was initially inj e ct e d into the n e twork). 

In accordance with an aspect of the principles of the invention as embodied and broadly described
herein, a method for detecting transmission of potentially unwanted e-mail
messages is provided. The method may include receiving e-mail messages and generating hash
values based on one or more portions of the e-mail messages. The method may further include
determining whether the generated hash values
match hash values associated with prior e-mail messages. The method may also include
determining that one of the e-mail messages is a potentially unwanted e-mail message when one
or more of the generated hash values associated with the e-mail message match one or more of
the hash values associated with the prior e-mail messages.

[0006] In accordance with another aspect of the invention, a mail server includes one or
more hash memories and a hash processor. The one or more hash memories is configured to
store count values associated with hash values. The hash processor is configured to receive an
e-mail message, hash one or more portions of the received e-mail message to generate hash values, and
increment the count values corresponding to the generated hash values. The hash processor is
further configured to determine whether the e-mail message is a potentially unwanted e-mail
message based on the incremented count values.
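A count-based check of this kind might be sketched as below. The class name `HashCounter`, the threshold of three, and the choice of CRC-32 are illustrative assumptions; the text does not fix any of them.

```python
import zlib
from collections import defaultdict


class HashCounter:
    """Keeps one count per hash value, standing in for a hash memory."""

    def __init__(self, threshold: int = 3):
        self.counts = defaultdict(int)
        self.threshold = threshold  # assumed cut-off, not from the text

    def observe(self, portions: list[bytes]) -> bool:
        """Increment the count for each portion's hash.

        Returns True when any incremented count reaches the threshold,
        marking the message as potentially unwanted.
        """
        suspicious = False
        for portion in portions:
            h = zlib.crc32(portion)
            self.counts[h] += 1
            if self.counts[h] >= self.threshold:
                suspicious = True
        return suspicious
```

With the default threshold, the third message that carries the same attachment bytes is the first one flagged; messages with unrelated content leave that count untouched.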

[0007] In accordance with yet another aspect of the invention, a method for detecting
transmission of unwanted e-mail messages is provided. The method includes receiving e-mail
messages and detecting unwanted e-mail messages of the received e-mail messages based on
hashes of previously received e-mail messages, where multiple hashes are performed on each of
the e-mail messages.

[0008] In accordance with a further aspect of the invention, a method for detecting
transmission of potentially unwanted e-mail messages is provided. The method includes
receiving an e-mail message; generating hash values over blocks of the e-mail message, where
the blocks include at least two of a main text portion, an attachment portion, and a header portion
of the e-mail message; determining whether the generated hash values match hash values
associated with prior e-mail messages; and determining that the e-mail message is a potentially
unwanted e-mail message when one or more of the generated hash values associated with the
e-mail message match one or more of the hash values associated with the prior e-mail messages.

[0009] In accordance with another aspect of the invention, a mail server in a network of
cooperating mail servers is provided. The mail server includes one or more hash memories and a
hash processor. The one or more hash memories is configured to store information relating
to hash values corresponding to previously-observed e-mails. The hash processor is configured
to receive at least some of the hash values from another one or more of the cooperating mail
servers and store information relating to the at least some of the hash values in at least one of the
one or more hash memories. The hash processor is further configured to receive an e-mail
message, hash one or more portions of the received e-mail message to generate hash values,
determine whether the generated hash values match the hash values corresponding to previously-
observed e-mails, and identify the received e-mail message as a potentially unwanted e-mail
message when one or more of the generated hash values associated with the received e-mail
message match one or more of the hash values corresponding to previously-observed e-mails.
[0010] In accordance with another aspect of the invention, a mail server is provided. The
mail server includes one or more hash memories and a hash processor. The one or more hash
memories is configured to store count values associated with hash values. The hash
processor is configured to receive e-mail messages, hash one or more portions of the received
e-mail messages to generate hash values, increment the count values corresponding to the
generated hash values, as incremented count values, and generate suspicion scores for the
received e-mail messages based on the incremented count values.
[0011] In accordance with a further aspect of the invention, a method for preventing
transmission of unwanted e-mail messages is provided. The method includes receiving an e-mail
message; generating hash values, as generated hash values, based on one or more portions of
the e-mail message as the e-mail message is
being received; and incrementally determining whether the generated hash values match hash
values associated with prior e-mail messages. The method further includes generating a
suspicion score for the e-mail message based on the incremental determining; and rejecting the
e-mail message when the suspicion score of the e-mail message is above a threshold.
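The incremental variant can be illustrated with a sketch in which the message arrives as a stream of chunks and the score is updated after each one. The function name `receive_incrementally`, the one-point-per-match scoring rule, and the default threshold are assumptions for illustration.

```python
import zlib


def receive_incrementally(chunks, known_hashes, threshold=2):
    """Consume chunks as they arrive and return (accepted, score).

    The score rises by one for each chunk whose hash matches a hash
    from a prior message (an assumed scoring rule). Reception stops as
    soon as the score is above the threshold.
    """
    score = 0
    for chunk in chunks:
        if zlib.crc32(chunk) in known_hashes:
            score += 1
        if score > threshold:
            return False, score  # reject without reading the remainder
    return True, score
```

Because `chunks` can be any iterator, a rejected message is abandoned before its remaining chunks are read, which is the point of scoring during reception rather than after it.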

(00121 pftof-pae - k -e te - .-T - h e-e yst e m - fflfty - dcteiTOine that one of the pockets is a potentially 

malicious -paeket^ 

t h ^ Hi sh-v a^ ^ ^ 

According to anoth e r implem e ntation consistent with th e pr e s e nt invention, a system for 
hafflp e rmg-trflmmt s sion of a potentially malicious packet is disclosed. The system includes 
mean s for r e ceiving a pack e t; means for generating on e or more ha s h valu e s from the packet; 

te ast-en e -of- fe ^^^^ 
pi : e4eiewn*ned^^ 

wh e n the packet is determin e d to b e a pot e ntially malicious packet- 
According to yet another impl e m e ntation con s ist e nt with th e pres e nt invention, a m e thod for 
storing- imsh-vakfes-tfoe^ 
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peftm»*iy^ 

pmm thdly mafei ens packet was one of the r e c e i v e d packets w ben one -or- mere -of the gen e rat e d 
BRIEF DESCRIPTION OF THE DRAWINGS 

The accompanying drawings, which are incorporated in and constitute a part of this
specification, illustrate the invention and, together with the description, explain the invention. In
the drawings,

FIG. 1 is a diagram of a system in which systems and methods consistent with the present
invention may be implemented;

FIG. 2 is an exemplary diagram of the e-mail server of FIG. 1 according to an
implementation consistent with the principles of the invention;

FIG. 3 is an exemplary functional block diagram of the e-mail server of Fig. 2
according to an implementation consistent with the principles of the invention;

[0013] Fig. 4 is an exemplary diagram of the hash processing block of Fig. 3
according to an implementation consistent with the principles of the
invention; and

[0014] Figs. 5A-5E are flowcharts of exemplary processing for detecting and/or preventing
transmission of an unwanted e-mail message, such as an e-mail containing a virus or worm,
including a polymorphic virus or worm, or an unsolicited commercial e-mail,
according to an implementation consistent with the principles of the invention.



DETAILED DESCRIPTION 

The following detailed description of the invention refers to the accompanying drawings. The 
same reference numbers in different drawings may identify the same or similar elements. Also, 
the following detailed description does not limit the invention. Instead, the scope of the invention 
is defined by the appended claims and equivalents. 

Systems and methods consistent with the present invention provide virus, worm, and unsolicited
e-mail detection and/or prevention in e-mail servers. Placing these features in e-mail servers
provides a number of new advantages, including the ability to align hash blocks to crucial
boundaries found in e-mail messages and eliminate certain counter-measures by the attacker,
such as using small Internet Protocol (IP) fragments to limit the detectable content in each
packet. It also allows these features to relate e-mail header fields with the potentially-harmful
segment of the message (usually an "attachment"), and decode common file-packing and
encoding formats that might otherwise make a virus or worm undetectable by the packet-based
technique (e.g., ".zip files").

[0015] By placing these features within an e-mail server, the ability to detect replicated
content in the network at points where large quantities of traffic are present is obtained. By
relating many otherwise-independent messages and finding common factors, the e-mail server
may detect unknown, as well as known, viruses and worms. These features may also be applied
to detect potential unsolicited commercial e-mail ("spam").

[0016] E-mail servers for major Internet Service Providers (ISPs) may process a million e-
mail messages a day, or more, in a single server. When viruses and worms are active in the
network, a substantial fraction of this e-mail may actually be traffic generated by the virus or
worm. Thus, an e-mail server may have dozens to thousands of examples of a single e-mail-
borne virus pass through it in a day, offering an excellent opportunity to determine the
relationships between e-mail messages and detect replicated content (a feature that is indicative
of virus/worm propagation) and spam, among other, more legitimate traffic (such as traffic from
legitimate mailing lists).

[0017] Systems and methods consistent with the principles of the invention provide
mechanisms to detect and stop e-mail-borne viruses and worms before the addressed user
receives them, in an environment where the virus is still inert. Current e-mail servers do not
normally execute any code in the e-mail being transported, so they are not usually subject to
virus/worm infections from the content of the e-mails they process, though they may be subject
to infection via other forms of attack.

[0018] Besides e-mail-borne viruses and worms, another common problem found in e-mail is
mass-e-mailing of unsolicited commercial e-mail, colloquially referred to as "spam." It is
estimated th of ah e-mail messages now received for delivery by major ISP 

e-mail servers is spam. 

[0019] Users of network e-mail services are desirous of mechanisms to block e-mail
containing viruses or worms from reaching their machines (where the virus or worm can easily
do harm before the user realizes its presence). Users are also desirous of mechanisms to block
unsolicited commercial e-mail that consumes their time and resources.

[0020] Many commercial e-mail services put a limit on each user's e-mail accumulating at
the server, and not yet downloaded to the customer's machine. If too much e-mail arrives
between times when the user reads his e-mail, additional e-mail is either "bounced" (i.e.,
returned to the sender's e-mail server) or even simply discarded, both of which events can
seriously inconvenience the user. Because the user has no control over arriving e-mail due to e-
mail-borne viruses/worms, or spam, it is a relatively common occurrence that the user's e-mail
quota overflows due to unwanted and potentially harmful messages. Similarly, the authors of e-
mail-borne viruses, as well as senders of spam, have no reason to limit the size of their messages.
As a result, these messages are often much larger than legitimate e-mail messages, thereby
increasing the risks of such denial of service to the user by overflowing the per-user e-mail
quota.

[0021] Users are not the only group inconvenienced by spam and e-mail-borne viruses and
worms. Because these types of unwanted e-mail can form a substantial fraction, even a majority,
of e-mail traffic in the Internet, for extended periods of time, ISPs typically must add extra
resources to handle a peak e-mail load that would otherwise be about half as large. This ratio of
unwanted-to-legitimate e-mail traffic appears to be growing daily. Systems and methods
consistent with the principles of the invention provide mechanisms to detect and discard
unwanted e-mail in network e-mail servers.

[0O22J /or prevent t h e transmission of malicious packets and trac e th e propagation of th e 
malicious pa ckets throug h a n etwork. Malicious packets, as used lierein. may include viruses^ 
worms, and other types of data w i th duplicat e d cont e nt, such a s ill e gal mass e -mail ( e .g., spam), 
tfe«t-we* e p eat- ^ 

1 ^ -hash e d-te--^ 

»Hft4? e-- ha^^ 

a packet may be hashed. 

EXEMPLARY SYSTEM CONFIGURATION 

FIG. 1 is a diagram of an exemplary system 100 in which systems and methods consistent with 






In re U.S. 10/654,771: Changes made to 10/251,403 to create CIP app 10/654,771

the present invention may be implemented. System 100 includes mail clients 110
connected to a mail server 120 via a network 130.
Connections made in system 100 may be via wired, wireless, and/or optical
communication paths. While FIG. 1 shows three mail clients 110 and
a single mail server 120, there can be more or fewer clients and
servers in other implementations consistent with the principles of the
invention.

[0023] Network 130 may facilitate communication between mail clients 110 and mail server
120. Typically, network 130 may include a collection of network devices, such
as routers or switches, that transfer data between mail clients 110 and mail server
120. In an implementation
consistent with the present invention, network 130 may take the form of a
wide area network, a local area network, an intranet, the Internet, a public telephone
network, a different type of network, or a combination of networks.

[0024] Mail clients 110 may include personal computers, laptops, personal digital assistants,
or other types of devices that are capable of interacting with mail server 120 to
receive e-mails. In another implementation, clients 110 may include software operating upon
one of these devices. Client 110 may present e-mails to a user via a graphical user interface.

[0025] Mail server 120 may include a computer or another device that is capable of
providing e-mail services for mail clients 110. In another implementation, server 120 may
include software operating upon one of these devices.

[9026] Fig. wid c area netwo rk (WA N), or t h e tiko . 

Afr fttt fc>neffl8ttfr » ¥S4^ ^ 
dem a kve a iv e -^ 

may includ e comput e rs or other types of communication devices (ref e rred to as "hosts") that 
e < »nnftel4^f)nM^ 
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A«toaoffle»s-s^tem 1 1 0 . for e xample, inel ud e s ho s t s f H) 1 1 1 1 I -3 conn e ct e d -in -a --LAN 

ee^%«fatte«-."}'fos4»"1"U~t43 connect to public network 150 via an intruder ■detee**ofH»yste«ft 

premis e used by an intrud e r det e ction syst e m is that malicious network traffic will have a 



■ t-wl e -s 



s4«ke«»d4Ky&e4 



s ystem-144 may take 



1 1 0: Wh e n a suspicious patt e nr or 
remedial action, or it can instruct a border t < 



s d e t e ct e d^ iiuruder defer 



r firewall to modify operation to address the 



the malicious traffic, dBear - dmg - packcts coming from a particiato s 
pack e ts addr es sed to a particular d e stination. 



e address, or discarding 



Awofummfr system 120 contain s di# e r e nt d ^ ^ ^ 1 10. The se -d e v i e e ; 

n*a-lkdmre--p^ 

FIG. I shows on l y autonomous system 120 as containing th e se d e vic e s, oth e r autonomous 
systemM«cluding autonomous system 1 1 0, may include them. 



rout e rs (SRI 1 SR1 1) 126 - 12 9 . Hoots 12! 123 m a y inclu de com pu ters or oth e r typ os of 

c-emnHink-atkM-d e ^ 

s yste»4^ - m^ 
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^ettf *t y -«e w e r - 4^ 

pegfeymS'-'Se»r€e-pafe^<leBtiifiGatie« wh e n a malicious packet is d e tected -by intrud e r detee tien 

424n-« : eH3&e j iW3^^ 

the pr e sent invention: 



Fig. 2 is an exemplary diagram of mail server 120 according to an
implementation consistent with the principles of the invention. Server 120
may include bus 210, processor 220, main memory 230,
read only memory (ROM) 240, storage device 250, input device 260, output device
270, and communication interface
280. Bus 210 permits communication among the components of server 120.
[0027] Processor 220 may include any type of conventional processor or
microprocessor that interprets and executes instructions.
Main memory 230 may include a random access memory (RAM) or another type
of dynamic storage device that stores information and instructions
for execution by processor 220. ROM 240 may include a conventional ROM device or
another type of static storage device that stores static information and instructions for use by
processor 220. Storage device 250 may include a magnetic and/or optical recording medium and
its corresponding drive.

[0028] Input device 260 may include one or more conventional mechanisms that permit an
operator to input information to server 120, such as a keyboard, a mouse, a pen, voice
recognition and/or biometric mechanisms, etc. Output device 270 may include one or more
conventional mechanisms that output information to the operator, such as a display, a printer, a
pair of speakers, etc. Communication interface 280 may include any transceiver-like mechanism
that enables server 120 to communicate with other devices and/or systems. For example,
communication interface 280 may include mechanisms for communicating with another device
or system via a network, such as network 130.
[0029] As will be described in detail below, server 120, consistent with the present
invention, provides e-mail services to clients 110, while detecting unwanted e-mails and/or
preventing unwanted e-mails from reaching clients 110.
fea^QM 306 may b e ^ 

2 - 0 § r «l se -- f e fer - r e ^ or op&saka e &ft 

ar*44hetf~eer-re^^ 

Bus 210 may i n clude a set of hard ware lines (conductors, optical fibers, or the like) that allow 
for data transfer among th e compon e nts of s e curity se rv e r 125. Displ a y d e vic e 2 1 2 may b e a 

alt e r-n a ti-v e -een^^^ ^ 
mk^?l^# - ^ 
s e curity s e nd e r 125. 

Gommunieation int e rfaee 21 8 enabl e s security s e rv e r 125 to communicate with oth e r 
nm y 4ne-tede- a 4Be4em ^ 
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to --e nay « »e^^ 
networks- Cenmnane^^^ ^ 

2444oda - eiiitate operator or machine remote control and oomm«aicQtion-wkfe--^H«ty-sefvw 
4-2§r 



Server 120 may perform these tasks in response to processor 220 executing
sequences of instructions contained in, for example, memory 230. These instructions
may be read into memory 230 from another computer-readable medium, such as storage
device 250 or a carrier wave, or from another device via
communication interface 280.

[0030] Execution of the sequences of instructions contained in memory 230 may cause
processor 220 to perform processes that will be described later.
Alternatively, hardwired circuitry may be used in place of or in combination with software
instructions to implement processes consistent with the present invention. Thus, processes
performed by server 120 are not limited to any specific combination of hardware circuitry and
software.

[0031] Fig. 3 is an exemplary functional block diagram of mail server 120 according to an
implementation consistent with the principles of the invention. Server 120 may include a Simple
Mail Transfer Protocol (SMTP) block 310, a Post Office Protocol (POP) block 320, an Internet
Message Access Protocol (IMAP) block 330, and a hash processing block 340.
[0032] SMTP block 310 may permit mail server 120 to communicate with other mail servers
connected to network 130 or another network. SMTP is designed to efficiently and reliably
transfer e-mail across networks. SMTP defines the interaction between mail servers to facilitate
the transfer of e-mail even when the mail servers are
implemented on different types of computers or running
different operating systems.
[0033] POP block 320 may permit mail clients 110 to retrieve e-mail from mail server 120.
POP block 320 may be designed to always receive incoming e-mail. POP block 320 may then
hold e-mail for mail clients 110 until mail clients 110 connect to download them.

[0034] IMAP block 330 may provide another mechanism by which mail clients 110 can
retrieve e-mail from mail server 120. IMAP block 330 may permit mail clients 110 to access
remote e-mail as if the e-mail was local to mail clients 110.

[0035] Hash processing block 340 may interact with SMTP block 310, POP block 320,
and/or IMAP block 330 to detect and prevent transmission of unwanted e-mail, such as e-mails
containing viruses or worms and unsolicited commercial e-mail (spam).

[0036] Fig. 4 is an exemplary diagram of hash processing block 340 according to an
implementation consistent with the principles of the invention. Hash processing block 340 may
include hash processor 410 and one or more hash memories 420. Hash processor 410 may
include a conventional processor, an application specific integrated circuit (ASIC), a field-
programmable gate array (FPGA), or some other type of device

R e turning to FIG. 1 , s e curity rout e rs 1 36- 129 may include network devices, such as rout e rs, that 

nmy-d e *e^t--afi^%r^ 

i deH t- i - fK - ^ 

s e«nyity--r-ente*s4-2-3-42^ 

F-lG----3-- fe -an-- e^ mpfe^^^ 
e - ens is ten^ 

within a d e vice that taps one or mor e bidir e ctional links of a rout e r, such as security routers 1 26- 
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sttejHKrsecority routers436-I2j>. In the di s cussion tfa 



Packet d c 



tee-tieft- l egk-300-may include hash processor 3 1 0 and hash me 



320, Hash 



tb i nat - iea - ef 



that generates one or more representations for each received e-mail and
records the e-mail representations in hash memory 420.
[0037] An e-mail representation will likely not be a copy of the entire e-mail, but rather it
may include a portion of the e-mail or some unique value representative of the
e-mail. For example, a fixed width number may be computed across portions of the e-mail
in a manner that allows the entire e-mail to be identified.
To further illustrate the use of representations, a 32-bit hash
value, or digest, may be computed across portions of each e-mail.
Then, the hash value may be stored in hash memory 420 or may be used as an index, or
address, into hash memory 420. Using the hash value, or an index derived therefrom,
results in efficient use of hash memory 420 while still allowing the content of each
e-mail passing through mail server 120 to be identified.

Systems and methods consistent with the present invention may use any storage scheme that
records information about one or more portions of each e-mail in a space-efficient
fashion, that can definitively determine if a portion of an e-mail has not been observed,
and that can respond positively (i.e., in a predictable way) when a portion of an e-mail
has been observed. Although systems and methods consistent with the present invention can use
virtually any technique for deriving representations of portions of e-mails, the
remaining discussion will use hash values as exemplary representations of portions of e-mails
received by mail server 120.
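One storage scheme with the properties listed above is a bit array addressed by an index derived from a 32-bit hash. The sketch below is only an illustration: the class name `HashMemory`, the table size of 2^20 bits, and the use of CRC-32 are assumptions, not details taken from the text.

```python
import zlib

MEMORY_BITS = 20  # assumed: 2**20 one-bit entries (128 KiB)


class HashMemory:
    """One bit per address; the address is derived from a 32-bit hash."""

    def __init__(self, bits: int = MEMORY_BITS):
        self.size = 1 << bits
        self.cells = bytearray(self.size // 8)

    def _index(self, portion: bytes) -> int:
        # zlib.crc32 returns an unsigned 32-bit value in Python 3.
        return zlib.crc32(portion) % self.size

    def record(self, portion: bytes) -> None:
        i = self._index(portion)
        self.cells[i >> 3] |= 1 << (i & 7)

    def seen(self, portion: bytes) -> bool:
        i = self._index(portion)
        return bool(self.cells[i >> 3] & (1 << (i & 7)))
```

A clear bit proves the portion was never recorded, so a "not observed" answer is definitive. A set bit always results from a recorded portion, but two different portions can share an address, so a "seen" answer may occasionally be a collision.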

[0038] In implementations consistent with the principles of the invention, hash processor 410
may hash one or more portions of a received e-mail to produce a hash value used to facilitate
hash-based detection. For example, hash processor 410
may hash one or more of the main text within the
message body, any attachments, and one or more header fields, such as sender-related fields
(e.g., "From:," "Sender:," "Reply-To:," "Return-Path:," and "Error-To:"). Hash processor 410
may perform one or more hashes on each of the e-mail portions using the same or different hash
functions.

[0039] As described in more detail below, hash processor 410 may use the hash results of the
hash operation to recognize duplicate occurrences of e-mail content and raise a warning if the
duplicate e-mail occurrences arrive within a short period of time and raise their level of
suspicion above some threshold. It may also be possible to use the hash results for tracing the
path of an unwanted e-mail through the network.

[0040] Each hash value may be determined by taking an input block of data and processing it to
obtain a numerical value that represents the given input data. Suitable hash functions are
readily known in the art and will not be discussed in detail herein. Examples of hash functions
include the Cyclic Redundancy Check (CRC) and Message Digest 5 (MD5).
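By way of illustration only, the block hashing described above might be sketched as follows. The 64-byte block size, the function name hash_blocks, and the choice to compute both a CRC-32 and an MD5 digest per block are assumptions made for this sketch; the text does not prescribe them.

```python
import hashlib
import zlib

BLOCK_SIZE = 64  # assumed block size in bytes; not fixed by the text


def hash_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Yield a (crc32, md5_hex) pair for each successive block of `data`."""
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        crc = zlib.crc32(block) & 0xFFFFFFFF   # 32-bit CRC as an unsigned integer
        md5 = hashlib.md5(block).hexdigest()   # 128-bit MD5 digest, hex-encoded
        yield crc, md5


# A 100-byte input is split into one full block and one partial block.
digests = list(hash_blocks(b"A" * 100))
```

Either value could serve as the fixed-length digest for a block; the CRC is cheaper to compute, while MD5 yields a wider value with fewer accidental collisions.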

The resulting hash value, also referred to as a message digest or hash digest, may include a
fixed length value. The hash value may serve as a signature for the data over which it
was computed.

The hash value essentially acts as a fingerprint identifying the input block of data over which it
was computed. Unlike fingerprints, however, there is a chance that two very different pieces of
data will hash to the same value, resulting in a hash collision. An acceptable hash function
should provide a good distribution of values over a variety of data inputs in order to prevent
these collisions. Because collisions occur when different input blocks result in the same hash
value, an ambiguity may arise when attempting to associate a result with a particular input.

Hash processor 410 may store a representation of each e-mail it observes in
hash memory 420. Hash processor 410 may store the actual hash values as the e-mail
representations or it may use other techniques for minimizing storage
requirements associated with retaining hash values and other information associated therewith. A
technique for minimizing storage requirements may use one or more arrays or Bloom
filters for storing hash values.

Rather than storing the actual hash value, which can typically be on the order of 32 bits or more
in length, hash processor 410 may use the hash value as an index for addressing an
array within hash memory 420. In other words, when hash processor 410
generates a hash value for a portion of an e-mail, the hash value serves
as the address location into the array. At the address corresponding to the hash value, a
count value may be incremented at the respective storage location,
thus indicating that a particular hash value, and hence a particular e-mail portion,
has been seen by hash processor 410. In one implementation, the count value is
associated with an 8-bit counter with a maximum value that sticks at 255. While counter arrays
are described by way of example, it will be appreciated by those skilled in the relevant art that
other storage techniques may be employed without departing from the spirit of the invention.
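A minimal sketch of such a counter array is given below, assuming a CRC-32 hash reduced modulo the array size as the index. The class name CounterArray, the default array size, and the method names are hypothetical and chosen only for illustration.

```python
import zlib


class CounterArray:
    """Array of saturating 8-bit counters addressed by a hash of the input."""

    def __init__(self, size: int = 1 << 16, max_count: int = 255):
        self.size = size              # number of storage locations (assumed)
        self.max_count = max_count    # counter sticks here; must be <= 255
        self.counts = bytearray(size)  # one 8-bit counter per location

    def _index(self, portion: bytes) -> int:
        # The hash value, reduced to the array size, is the storage address.
        return (zlib.crc32(portion) & 0xFFFFFFFF) % self.size

    def observe(self, portion: bytes) -> int:
        """Record one sighting of `portion` and return its updated count."""
        i = self._index(portion)
        if self.counts[i] < self.max_count:
            self.counts[i] += 1
        return self.counts[i]

    def count(self, portion: bytes) -> int:
        return self.counts[self._index(portion)]


memory = CounterArray()
for _ in range(300):
    memory.observe(b"one replicated e-mail portion")
```

Because distinct portions can map to the same location, a non-zero count only suggests that a portion was seen, whereas a zero count shows that it was not.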
[0041] Hash memory 420 may also store other information that is used to determine the overall
suspiciousness of an e-mail message. For example, the count value (described above) may be
compared to a threshold, and a suspicion count for the e-mail may be incremented if the
threshold is exceeded. Hence, there may be a direct relationship between the count value and the
suspicion count, and it may be possible for the two values to be the same. The larger the
suspicion count, the more important the hit should be considered in determining the overall
suspiciousness of the packet. Alternatively, the suspicion count can be combined in a "scoring
function" with values from this or other hash blocks in the same message in order to determine
whether the message should be considered suspicious.

[0042] It is not enough, however, for hash memory 420 to simply identify that an e-mail
contains content that has been seen recently. There are many legitimate sources (e.g., e-mail list
servers) that produce multiple copies of the same message, addressed to multiple recipients.
Similarly, individual users often e-mail messages to a group of people and, thus, multiple copies
might be seen if several recipients happen to receive their mail from the same server. Also,
people often forward copies of received messages to friends or co-workers.

[0043] In addition, virus/worm authors typically try to minimize the replicated content in
each copy of the virus/worm, in order to not be detected by existing virus and worm detection
technology that depends on detecting fixed sequences of bytes in a known virus or worm. These
mutable viruses/worms are usually known as polymorphic, and the attacker's goal is to minimize
the recognizability of the virus or worm by scrambling each copy in a different way. For the
virus or worm to remain viable, however, a small part of it can be mutable in only a relatively
small number of ways, because some of its code must be immediately executable by the victim's
computer, and that limits the mutation and obscurement possibilities for the critical initial code
part.

[0044] In order to accomplish the proper classification of various types of legitimate and
unwanted e-mail messages, multiple hash memories 420 can be employed, with separate hash
memories 420 being used for specific sub-parts of a standard e-mail message. The outputs of
different ones of hash memories 420 can then be combined in an overall "scoring" or
classification function to determine whether the message is undesirable or legitimate, and
possibly estimate the probability that it belongs to a particular class of traffic, such as a
virus/worm message, spam, e-mail list message, or normal user-to-user message.
[0045] For e-mail following the Internet mail standard RFC 822 (and its various extensions),
hashing of certain individual e-mail header fields into field-specific hash memories 420 may be
useful. Among the header fields for which this may be helpful are: (1) various sender-related
fields, such as "From:", "Sender:", "Reply-To:", "Return-Path:" and "Error-To:"; (2) the "To:"
field (often a fixed value for a mailing list, frequently missing or idiosyncratic in spam
messages); and (3) the last few "Received:" headers (i.e., the earliest ones, since they are
normally added at the top of the message), excluding any obvious timestamp data. It may also
be useful to hash a combination of the "From:" field and the e-mail address of the recipient
(transferred as part of the SMTP mail-transfer protocol, and not necessarily found in the message
itself).
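As a rough sketch of field-specific hashing, the following uses Python's standard email parser and CRC-32. The function name header_hashes, the lower-casing of field values before hashing, and the "From+recipient" key are assumptions of this sketch; handling of "Received:" headers is omitted for brevity.

```python
import zlib
from email import message_from_string

SENDER_FIELDS = ("From", "Sender", "Reply-To", "Return-Path", "Error-To")


def header_hashes(raw_message: str, rcpt_to: str) -> dict:
    """Hash each present sender-related field, the To field, and From+recipient."""
    msg = message_from_string(raw_message)
    hashes = {}
    for name in SENDER_FIELDS + ("To",):
        value = msg.get(name)
        if value is not None:
            # Normalisation (strip + lower-case) is an assumption of this sketch.
            hashes[name] = zlib.crc32(value.strip().lower().encode("utf-8"))
    # The recipient comes from the SMTP envelope, not from the message itself.
    sender = msg.get("From", "").strip().lower()
    combined = sender + "\n" + rcpt_to.strip().lower()
    hashes["From+recipient"] = zlib.crc32(combined.encode("utf-8"))
    return hashes


raw = "From: Alice <a@example.com>\nTo: list@example.org\nSubject: hi\n\nbody\n"
result = header_hashes(raw, "bob@example.net")
```

Each resulting value would then index its own field-specific hash memory, so that a frequently seen sender and a frequently seen message body can be told apart.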

[0046] Any or all of hash memories 420 may be pre-loaded with knowledge of known good
or bad traffic. For example, known viruses and spam content (e.g., the infamous "Craig
Shergold letter" or many pyramid swindle letters) can be pre-hashed into the relevant hash
memories 420 and/or periodically refreshed in the memory as part of a periodic "cleaning"
process described below. Also, known legitimate mailing lists, such as mailing lists from
legitimate e-mail list servers, can be added to a "From" hash memory 420 that passes
traffic without further examination.

[0047] Over time, hash memories 420 may fill up and the possibility of
overflowing an existing count value increases. The risk of overflowing a
count value may be reduced if the counter arrays are periodically
flushed to other storage media, such as a magnetic disk drive, optical media, solid state drive, or
the like. Alternatively, the counter arrays may be slowly and incrementally erased. To
facilitate this, a time-table may be established for flushing/erasing the counter
arrays. If desired, the flushing/erasing cycle can be reduced by computing hash values only for a
subset of the e-mails received by mail server 120. While this
approach reduces the flushing/erasing cycle, it increases the possibility that a target
e-mail may be missed (i.e., a hash value is not computed over a portion of it).
[0048] Non-zero storage locations within hash memories 420 may be decremented
periodically rather than being erased. This may ensure that the "random noise" from normal
e-mail traffic would not remain in a counter array indefinitely. Replicated traffic (e.g., e-mails
containing a virus/worm that are propagating repeatedly across the network), however, would
normally cause the relevant storage locations to stay substantially above the "background noise"
level.

[0049] One way to decrement the count values in the counter array fairly is to keep a total
count, for each hash memory 420, of every time one of the count values is incremented. After
this total count reaches some threshold value (probably in the millions), for every time a count
value is incremented in hash memory 420, another count value gets decremented. One way to
pick the count value to decrement is to keep a counter, as a decrement pointer, that simply
iterates through the storage locations sequentially. Every time a decrement operation is
performed, the following may be done: (a) examine the candidate count value to be decremented
and, if non-zero, decrement it and increment the decrement pointer to the next storage location;
and (b) if the candidate count value is zero, then examine each sequentially next storage
location until a non-zero count value is found, decrement that count value, and advance the
decrement pointer to the following storage location.

[0050] It may be important to avoid decrementing any counters below zero, while not
biasing decrements unfairly. Because it may be assumed that the hash is random, this technique
should not favor any particular storage location, since it visits each of them before starting over.
This technique may be superior to a timer-based decrement because it maintains a fixed total count
population across all of the storage locations, representing the most recent history of traffic, and
is not subject to changes in behavior as the volume of traffic varies over time.

[0051] A variation of this technique may include randomly selecting a count value to
decrement, rather than processing them cyclically. In this variation, if the chosen count value is
already zero, then another one could be picked randomly, or the count values in the storage
locations following the initially-chosen one could be examined in series, until a non-zero count
value is found.
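The sequential decrement-pointer technique might be sketched as follows. The class name DecayingCounters and the tiny array size and threshold used in the example are hypothetical; a practical threshold would be far larger.

```python
class DecayingCounters:
    """Counter array in which, past a threshold, each increment forces one decrement."""

    def __init__(self, size: int, start_decay_after: int):
        self.counts = [0] * size
        self.total_increments = 0
        self.start_decay_after = start_decay_after  # threshold on the total count
        self.pointer = 0                            # decrement pointer

    def increment(self, index: int) -> None:
        self.counts[index] += 1
        self.total_increments += 1
        if self.total_increments > self.start_decay_after:
            self._decrement_one()

    def _decrement_one(self) -> None:
        n = len(self.counts)
        # Starting at the pointer, skip zero entries so nothing drops below zero.
        for step in range(n):
            i = (self.pointer + step) % n
            if self.counts[i] > 0:
                self.counts[i] -= 1
                self.pointer = (i + 1) % n  # advance to the following location
                return


table = DecayingCounters(size=4, start_decay_after=3)
for idx in (0, 0, 1, 2, 2, 3):
    table.increment(idx)
```

Once the threshold is passed, the sum of all counters stays constant, which is the fixed total count population described above. The random variation would differ only in choosing the starting index at random instead of using the pointer.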




EXEMPLARY PROCESSING FOR UNWANTED E-MAIL DETECTION/PREVENTION
[0052] Figs. 5A-5E are flowcharts of exemplary processing for detecting and/or preventing
transmission of unwanted e-mail, such as an e-mail containing a virus or worm, including a
polymorphic virus or worm, or an unsolicited commercial e-mail (spam), according to an
implementation consistent with the principles of the invention. The processing of Figs. 5A-5E
will be described in terms of a series of acts that may be performed by mail server 120. In
implementations consistent with the principles of the invention, some of the acts may be optional
and/or performed in an order different than that described. In other implementations, different
acts may be substituted for described acts or added to the process.

[0053] Processing may begin when hash processor 410 (Fig. 4) receives, or
otherwise observes, an e-mail message (act 502) (Fig. 5A). Hash processor
410 may hash the main text of the message body, excluding any attachments (act 504).
When hashing the main text, hash processor 410 may perform one or more conventional
hashes covering one or more portions, or all, of the main text. For example, hash processor 410
may perform hash functions on fixed or variable sized
blocks of the main text. It may be beneficial for hash processor 410 to perform multiple hashes
on each of the blocks using the same or different hash functions.

[0054] It may be desirable to pre-process the main text to remove attempts to fool
pattern-matching mail filters. An example of this is HTML e-mail,
where spammers often insert random text strings in HTML comments between or within words
of the text. Such e-mail may be referred to as "polymorphic spam" because it attempts to make
each message appear unique. This method for evading detection might otherwise defeat the hash
detection technique, or other string-matching techniques. Thus, removing all HTML comments
from the message before hashing it may be desirable. It might also be useful to delete HTML
tags from the message, or apply other specialized, but simple, pre-processing techniques to
remove content not actually presented to the user. In general, this may be done in parallel with
the hashing of the message text, since viruses and worms may be hidden in the non-visible
content of the message text.
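One possible form of this pre-processing is sketched below using regular expressions. The function name visible_text and the whitespace normalisation are assumptions of this sketch, and the patterns are a simplification rather than a full HTML parser.

```python
import re
import zlib

COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)  # HTML comments, non-greedy
TAG_RE = re.compile(r"<[^>]+>")                     # simplistic tag matcher


def visible_text(html: str, strip_tags: bool = True) -> str:
    """Remove HTML comments (and optionally tags) before the text is hashed."""
    text = COMMENT_RE.sub("", html)
    if strip_tags:
        text = TAG_RE.sub("", text)
    return " ".join(text.split())  # collapse runs of whitespace


# Two copies of one message with different random comments inserted.
copy_a = "<p>Buy <!-- x81k -->che<!--q9-->ap pills</p>"
copy_b = "<p>Buy <!-- 77zz -->che<!--pw-->ap pills</p>"
hash_a = zlib.crc32(visible_text(copy_a).encode("utf-8"))
hash_b = zlib.crc32(visible_text(copy_b).encode("utf-8"))
```

After pre-processing, both copies reduce to the same visible text and therefore produce the same hash value, even though their raw bytes differ.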

[0055] Hash processor 410 may also hash any attachments, after first attempting to expand
them if they appear to be known types of compressed files (e.g., "zip" files) (act 506). When
hashing an attachment, hash processor 410 may perform one or more conventional hashes
covering one or more portions, or all, of the attachment. For example, hash processor 410 may
perform hash functions on fixed or variable sized blocks of the attachment. It may be beneficial
for hash processor 410 to perform multiple hashes on each of the blocks using the same or
different hash functions.

[0056] Hash processor 410 may compare the main text and attachment hashes with known
viruses, worms, or spam content in a hash memory 420 that is pre-loaded with information from
known viruses, worms, and spam content (acts 508 and 510). If there are any hits in this hash
memory 420, there is a probability that the e-mail message contains a virus or worm or is spam.
A known polymorphic virus may have only a small number of hashes that match in this hash
memory 420, out of the total number of hash blocks in the message. A non-polymorphic virus
may have a very high fraction of the hash blocks hit in hash memory 420. For this reason,
storage locations within hash memory 420 that contain entries from polymorphic viruses or
worms may be given more weight during the pre-loading process, such as by giving them a high
initial suspicion count value.
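To illustrate the idea of a hit fraction, the sketch below pre-loads a set of block hashes from a known sample and measures what fraction of a message's blocks match. A Python set stands in for the pre-loaded hash memory, and the names block_hashes and hit_fraction, the 64-byte block size, and the sample data are all hypothetical.

```python
import zlib


def block_hashes(data: bytes, block_size: int = 64) -> list:
    """CRC-32 of each successive block of `data`."""
    return [zlib.crc32(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def hit_fraction(data: bytes, known_bad: set, block_size: int = 64) -> float:
    """Fraction of the blocks of `data` whose hash is present in `known_bad`."""
    hashes = block_hashes(data, block_size)
    if not hashes:
        return 0.0
    hits = sum(1 for h in hashes if h in known_bad)
    return hits / len(hashes)


known_sample = bytes(range(256))              # stand-in for a known virus body
known_bad = set(block_hashes(known_sample))   # "pre-loaded" hash memory

exact_copy = hit_fraction(known_sample, known_bad)
# A variant that keeps only the first block of the sample unchanged.
variant = hit_fraction(known_sample[:64] + b"\x00" * 192, known_bad)
```

An exact copy hits on every block, while the variant hits on only some of them, which is why entries for polymorphic samples may deserve extra weight.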

[0057] A high fraction of hits in this hash memory 420 may cause the message to be marked
as a probable known virus/worm or spam. In this case, the e-mail message can be sidetracked
for remedial action, as described below.

[0058] A message with a significant "score" from polymorphic virus/worm hash value hits
may or may not be a virus/worm instance, and may be sidetracked for further investigation, or
marked as suspicious before forwarding to the recipient. An additional check may also be made
to determine the level of suspicion.

[0059] For example, hash processor 410 may hash a concatenation of the From and To
header fields of the e-mail message (act 512) (Fig. 5B). Hash processor 410 may then check the
suspicion counts in hash memories 420 for the hashes of the main text, any attachments, and the
concatenated From/To (act 514). Hash processor 410 may determine whether the main text or
attachment suspicion count is significantly higher than the From/To suspicion count (act 516). If
so, then the content is appearing much more frequently outside the set of messages between this
set of users (which might otherwise be due to an e-mail exchange with repeated message
quotations) and, thus, is much more suspicious.
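The comparison of act 516 could be expressed along the following lines. The text does not quantify "significantly higher," so the ratio parameter and the function name content_outpaces_correspondents are assumptions made purely for illustration.

```python
def content_outpaces_correspondents(content_counts, from_to_count, ratio=4.0):
    """Return True if the highest content suspicion count is at least `ratio`
    times the From/To suspicion count (the ratio is an assumed tunable)."""
    peak = max(content_counts, default=0)
    # Treat a From/To count of zero as one so the comparison stays meaningful.
    return peak > 0 and peak >= ratio * max(from_to_count, 1)


# Content seen far more often than this sender/recipient pair: suspicious.
widely_replicated = content_outpaces_correspondents([40, 3], from_to_count=2)
# Content seen about as often as the pair itself, e.g. a quoted reply thread.
ordinary_thread = content_outpaces_correspondents([5], from_to_count=4)
```

A ratio test of this kind separates content that repeats only within one ongoing conversation from content that repeats across many unrelated senders and recipients.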

[0060] When this occurs, hash processor 410 may take remedial action (act 518). The
remedial action taken may be programmable or determined by
an operator of mail server 120. For example, hash processor 410 may discard the e-mail. This is
not recommended for anything but virtually-certain virus/worm/spam identification, such as a
perfect match to a known virus.

[0061] As an alternate technique, hash processor 410 may mark the e-mail with a warning in
the message body, in an additional header, or other user-visible annotation, and allow the user to
deal with it when it is downloaded. For data that appears to be from an unknown mailing list, a
variant of this option is to request the user to send back a reply message to the server, classifying
the suspect message as either spam or a mailing list. In the latter case, the mailing list source
address can be added to the "known legitimate mailing lists" hash memory 420.
[0062] As another technique, hash processor 410 may subject the e-mail to more
sophisticated (and possibly more resource-consuming) detection algorithms to make a more
certain determination. This is recommended for potential unknown viruses/worms or possible
detection of a polymorphic virus/worm.

[0063] As yet another technique, hash processor 410 may hold the e-mail message in a
special area and create a special e-mail message to notify the user of the held message (probably
including From and Subject fields). Hash processor 410 may also give instructions on how to
retrieve the message.

[0064] As a further technique, hash processor 410 may mark the e-mail message with its
suspicion score result, but leave it queued for the user's retrieval. If the user's quota would
overflow when a new message arrives, the score of the incoming message and the highest score
of the queued messages are compared. If the highest queued message has a score above a
settable threshold, and the new message's score is lower than the threshold, the queued message
with the highest score may be deleted from the queue to make room for the new message.
Otherwise, if the new message has a score above the threshold, it may be discarded or "bounced"
(e.g., the sending e-mail server is told to hold the message and retry it later). Alternatively, if it
is desired to never bounce incoming messages, mail server 120 may accept the incoming
message into the user's queue and repeatedly delete messages with the highest suspicion score
from the queue until the total is below the user's quota again.
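The quota handling described above might look roughly like the following. For simplicity the quota is modelled as a maximum number of queued messages, which is an assumption of this sketch, as are the names Queued and admit and the returned status strings. The case in which neither score exceeds the threshold is not addressed in the text and is reported here as "undecided".

```python
from dataclasses import dataclass


@dataclass
class Queued:
    msg_id: str
    score: float  # suspicion score assigned when the message was received


def admit(queue: list, new: Queued, max_messages: int, threshold: float) -> str:
    """Try to queue `new`, evicting the most suspicious message if warranted."""
    if len(queue) < max_messages:
        queue.append(new)
        return "queued"
    worst = max(queue, key=lambda m: m.score, default=None)
    if worst is not None and worst.score > threshold and new.score < threshold:
        queue.remove(worst)   # make room by deleting the highest-scoring message
        queue.append(new)
        return "queued_after_eviction"
    if new.score > threshold:
        return "bounced"      # sender is asked to hold the message and retry
    return "undecided"        # case left open by the description


mailbox = [Queued("a", 0.1), Queued("b", 0.9)]
first = admit(mailbox, Queued("c", 0.2), max_messages=2, threshold=0.5)
second = admit(mailbox, Queued("d", 0.8), max_messages=2, threshold=0.5)
```

In this example the first arrival displaces the highly suspicious queued message, and the second, itself suspicious, is bounced because nothing in the queue scores above the threshold.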
[0065] As another technique, hash processor 410 may apply hash-based functions as the
e-mail message starts arriving from the sending server and determine the message's suspicion
score incrementally as the message is read in. If the message has a high-enough suspicion score
(above a threshold) during the early part of the message, mail server 120 may reject the message,
optionally with either a "retry later" or a "permanent refusal" result to the sending server (which
one is used may be determined by settable thresholds applied to the total suspicion score, and
possibly other factors, such as server load). This results in the unwanted e-mail using up less
network bandwidth and receiving server resources, and penalizes servers sending unwanted mail,
relative to those that do not.

[0066] If the suspicion count for the main text or any attachment is not significantly higher
than the From/To suspicion count (act 516), hash processor 410 may determine whether the main
text or any attachment has significant replicated content (non-zero or high suspicion count values
for many hash blocks in the text/attachment content in all storage locations of hash memories
420) (act 520) (Fig. 5A). If not, the message is probably a normal user-to-user e-mail. These
types of messages may be "passed" without further examination. When appropriate, hash
processor 410 may also record the generated hash values by incrementing the suspicion count
value in the corresponding storage locations in hash memory 420.

[0067] If the message text is substantially replicated (e.g., greater than 90%), hash processor
410 may check one or more portions of the e-mail message against known legitimate mailing
lists within hash memory 420 (act 522) (Fig. 5C). For example, hash processor 410 may hash
the From or Sender fields of the e-mail message and compare it/them to known legitimate
mailing lists within hash memory 420. Hash processor 410 may also determine whether the
e-mail actually appears to originate from the correct source for the mailing list by examining, for
example, the sequence of Received headers. Hash processor 410 may further examine a
combination of the From or Sender fields and the recipient address to determine if the recipient
has previously received e-mail from the sender. This is typical for mailing lists, but atypical of
unwanted e-mail, which will normally not have access to the actual list of recipients for the
mailing list. Failure of this examination may simply pass the message on, but mark it as
"suspicious," since the recipient may simply be a new subscriber to the mailing list, or the
mailings may be infrequent enough to not persist in the hash counters between mailings.
[0068] If there is a match with a legitimate mailing list (act 524), then the message is
probably a legitimate mailing list duplicate and may be passed with no further examination. This
assumes that the mailing list server employs some kind of filtering to exclude unwanted e-mail
(e.g., refusing to forward e-mail that does not originate with a known list recipient or refusing
e-mail with attachments).

[0069] If there is no match with any legitimate mailing lists within hash memory 420, hash
processor 410 may hash the sender-related fields (e.g., From, Sender, Reply-To) (act 526). Hash
processor 410 may then determine the suspicion count for the sender-related hashes in hash
memories 420 (act 528).

[0070] Hash processor 410 may determine whether the suspicion counts for the sender-
related hashes are similar to the suspicion count(s) for the main text hash(es) (act 530) (Fig. 5D).
If both From and Sender fields are present, then the Sender field should match with roughly the
same suspicion count value as the message body hash. The From field may or may not match.
For a legitimate mailing list, it may be a legitimate mailing list that is not in the known
legitimate mailing lists hash memory 420 (or in the case where there is no known legitimate
mailing lists hash memory 420). If only the From field is present, it should match about as well
as the message text for a mailing list. If none of the sender-related fields match as well as the
message text, the e-mail message may be considered moderately suspicious (probably spam,
with a variable and fictitious From address or the like).

[0071] As an additional check, hash processor 410 may hash the concatenation of the sender-
related field with the highest suspicion count value and the e-mail recipient's address (act 532).
Hash processor 410 may then check the suspicion count for the concatenation in a hash memory
420 used just for this check (act 534). If it matches with a significant suspicion count value (act
536) (Fig. 5E), then the recipient has recently received multiple messages from this source,
which makes it probable that it is a mailing list. The e-mail message may then be passed without
further examination.

[0072] If the message text or attachments are mostly replicated (e.g., greater than 90% of the
hash blocks), but with mostly low suspicion count values in hash memory 420 (act 538), then the
message is probably a case of a small-scale replication of a single message to multiple recipients.
In this case, the e-mail message may then be passed without further examination.
[0073] If the message text or attachments contain some significant degree of content
replication (say, greater than some threshold fraction of the hash blocks) and at least some of the
hash values have high suspicion count values in hash memory 420 (act 540), then the message is
fairly likely to be a virus/worm or spam. A virus or worm should be considered more likely if the
high-count matches are in an attachment. If the highly-replicated content is in the message text,
then the message is more likely to be spam, though it is possible that e-mail text employing a
scripting language (e.g., Java script) might also contain a virus.

[0074] If the replication is in the message text, and the suspicion count is substantially higher
for the message text than for the From field, the message is likely to be spam (because spammers
generally vary the From field to evade simpler spam filters). A similar check can be made for
the concatenation of the From and To header fields, except that in this case, it is most suspicious
if the From/To hash misses (low or zero suspicion count), indicating that the sender does not
ordinarily send e-mail to that recipient, making it unlikely to be a mailing list, and very likely to
be a spammer (because they normally employ random or fictitious From addresses).
[0075] In the above cases, hash processor 410 may take remedial action (act 542). The
remedial action taken by hash processor 410 may vary as described above.
[0076]

EXEMPLARY PROCESSING FOR SOURCE PATH IDENTIFICATION 

F-K4/-6--is-a--^ 

th e principles of the inv e ntion. The processing of FIG. 6 may b e p e rform e d by a s e curity server, 
suefe-afr-seewity server 125, or other devices configut-ed^faoe^t^^-^^-^"^^^^ 
packets. In oth e r implementations, one or more of the d e scrib e d ac ts may b e p e rformed- by other 

Pfoe es tktgHtf^ I n t mfe 

de4 e €4ien-- s y^m42^^^ 
e i * a ft * p k ^4n t fndef - 4 e ^^^^ 
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paftof-an-abiMr^ 
d e t^4k » n -sys a e B ^^^^ 

within -autoaemeas- system 4 20v Th e notification ■ may ■ imMe4hQ'm»UeiBm'f)mk^'m'P@s^&m 
thereof -a jeng-wiliv^^^ 

information, link information, and the lik e . 
Alter-r- e e e ivin^ 

participating rotif e rs, s uch a s ■s ecitrityTO ; atBT ; s-}24"129 ' (actS ' 6Q5"a i Hl-61 : 0>"¥vxaiBples--of 
additional information that may be included in the query are, but are not limited to, destination 
addf e ss e s4e? - fHH - tk4^ 

■H-HWnmtK>n time - to - iiv'e and the like: 

S e curity serva-"l'2§"iinay'then'$end"the"qaery"tO'Seeur'it>' rout e rfs) l ocat e d on e hop away (act 
%4-£) : --4-he see«tf& yH? ettte*foO^^ 
mal t ekftHr p ack et: ^ 

response may indicate that th e security rout e r has se e n the malicious pack e t, or alt e rnatively, that 
k-has-notv-lt is important to observe that the two answers are not eqtiaHft-d^f-degfe e -el 
c e rtainty. If a security rout e r does not hav e a hash matching th e malicious packet, the -security 

how e v e rrtnea^ 
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m»tef(s)-je€ate44l«-ee4H>f>S"awayr-ftiMl so on. Th js ^^vardingmay^ 
dev4e ^s - 4¥rtMn - pui^^ 

out approach -because th e qu e ry travels a path that, e xt e nds outward from security s e rv e r 125. 
A-k^m#veiy;-aft-OHtwaf4"»»-ftpproach may be used. 

S e ew4 t y -- ^^w^^ ^ 

security neut e rs have se en th e ■ «)alicjo« S' pai-k e t|ac^"630^nd ' 625):4fa ' T e spo»s e' iadicat e s4hat 
the security router has seen the malicious packet, security server 125 associates the response and 
id e Milkat^ 

Alternatively, if t he respoas e -H^icate:! that the security router has not seen the malicious packed 
s e eurity ■ server 425 associat e s th e r e sponse and th e JD informa t ion for th e s e curity rout e r ' with 
inactive path data (act 635). 

S e euf4 t y --se i^ e r4^ 

tak e n-by-tln^^^ 

sef¥ef433Haaay^^ 

routers (acts 640 and 6 - 15 ). S e curity s e rver 125 may att e mpt to build a trac e with each receiv e d 
respometo determine the ingress point for the malicious fa^feetr4%e4«gfes»fet«f^ay-t^»^ 
where th e malicious pack e t e nter e d autonomous syst e m 1 20, public network i 50, or another 

fMths-^ a y- e me ^ -as-- a -r e stdt--of-k> s fe 
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eee*refene e« ^8£j^^ 

to -compete --large -hash -valu e s -over th e pack e ts sinc e th e chancer <?f-e^M^e»»-«6e'«»^e'-a«t«ber 
t>Pblts €6:»iftrisj-ftg the hash value decreases. Another mechanism to reduce fal s e-positives 
r^ w l teg4%om -- ee^^ 

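A further mechanism for reducing collisions concerns the hash tables held in the memories of
participating routers. Rather than computing a single hash value and setting
a single bit for an observed packet, a plurality of hash values may be computed for each
observed packet using several unique hash functions. This produces a corresponding number of
unique hash values for each observed packet. While this approach fills the hash table at a faster
rate, the reduction in the number of hash collisions makes the tradeoff worthwhile in many
instances. For example, Bloom filters may be used to compute multiple hash values over a given
packet to reduce the number of collisions.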
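
As an illustration of the multiple-hash approach, the following Python sketch shows a small Bloom filter. It is not taken from the specification: the class name, the table size, the number of hash functions, and the use of SHA-256 with a one-byte prefix to derive the individual hash functions are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter; sizes and hash construction are illustrative assumptions."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, data: bytes):
        # Derive several hash functions by prefixing the data with a distinct byte.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, data: bytes) -> None:
        """Set one bit per hash function for an observed packet."""
        for pos in self._positions(data):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, data: bytes) -> bool:
        """False means definitely not seen; True means seen or a collision."""
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(data))


bf = BloomFilter()
bf.add(b"malicious payload")
```

A lookup reports a match only when every one of the bits is set, so a false positive requires simultaneous collisions across all of the hash functions, at the cost of filling the table faster.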

When security server 125 has determined an ingress point for the malicious packet, it may report
the ingress point (act 650). Security server 125 may also take remedial actions (act 655). Often it
will be desirable to have the participating router closest to the ingress point close off the
ingress path used by the malicious packet. As such, security server 125 may send a message to the
respective participating router instructing it to do so. Security server 125 may also notify other
devices in the network, for example, a network operations center.
EXEMPLARY PROCESSING FOR DETERMINING WHETHER A MALICIOUS PACKET HAS BEEN OBSERVED

FIG. 7 is a flowchart of exemplary processing for determining whether a malicious packet, such
as a virus or worm, has been observed according to an implementation consistent with the
principles of the invention. The processing may be performed by a security router, such as
security router 126, or by other devices configured to trace the paths taken by malicious packets.
In other implementations, one or more of the described acts may be performed by other systems or
devices within system 100.

Processing may begin when security router 126 receives a query from security server 125 (act
705). As described above, the query may include a TTL field. A TTL field may be employed
because it provides an efficient mechanism for ensuring that a security router responds only to
relevant, or timely, queries. In addition, employing TTL fields may reduce the number of queries
traversing the network because queries with expired TTL fields may be discarded.

If the query includes a TTL field, security router 126 may determine if the TTL field in the query
has expired (act 710). If the TTL field has expired, security router 126 may discard the query
(act 715). If the TTL field has not expired, security router 126 may hash the malicious packet
contained within the query.
Security router 126 may use the resulting hash value(s) to address into hash memory 320. At each
of the addresses, security router 126 may determine whether indicator field 412 indicates that a
prior packet with the same hash value has been received. If none of the entries indicates a match,
security router 126 may send a negative response to security server 125 (act 735). Otherwise,
security router 126 may forward the query to other security routers, excluding the direction from
which the query was received (act 740). Security router 126 may also send a positive response to
security server 125, indicating that the packet has been observed (act 745), that is, that it is
among the packets that have passed through security router 126.
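
The query handling of FIG. 7 can be sketched as follows. This is a simplified illustration, not the claimed implementation: the `Query` and `SecurityRouter` names, the single hash function, the memory size, and the convention that a TTL of zero or less counts as expired are assumptions introduced for the example. Forwarding of the query to neighboring routers (act 740) is omitted.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Query:
    packet: bytes  # copy of the malicious packet carried in the query
    ttl: int       # assumed convention: ttl <= 0 means the query has expired


class SecurityRouter:
    """Hypothetical router model with one indicator entry per hash address."""

    def __init__(self, memory_entries: int = 1 << 16):
        self.memory_entries = memory_entries
        self.indicator = bytearray(memory_entries)  # one byte per entry for clarity

    def _address(self, packet: bytes) -> int:
        digest = hashlib.sha256(packet).digest()
        return int.from_bytes(digest[:8], "big") % self.memory_entries

    def observe(self, packet: bytes) -> None:
        """Record that a packet passed through the router."""
        self.indicator[self._address(packet)] = 1

    def handle_query(self, query: Query) -> Optional[str]:
        """Return 'positive', 'negative', or None when the query is discarded."""
        if query.ttl <= 0:
            return None  # expired query is discarded
        seen = self.indicator[self._address(query.packet)] == 1
        return "positive" if seen else "negative"


router = SecurityRouter()
router.observe(b"worm payload")
reply = router.handle_query(Query(packet=b"worm payload", ttl=3))
```

A "negative" reply is definitive, whereas a "positive" reply may also be produced by a different packet that hashes to the same address.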

CONCLUSION 

Systems and methods consistent with the present invention provide mechanisms within an e-mail
server to detect and/or prevent transmission of unwanted e-mail, such as e-mail containing
viruses or worms, including polymorphic viruses and worms, and unsolicited commercial e-mail
(spam).

[0077] Implementation of a hash-based detection mechanism in an e-mail server at the e-mail
message level provides advantages over a packet-based implementation in a router or other
network node device. For example, the entire e-mail message has been re-assembled, both at the
packet level (i.e., IP fragment re-assembly) and at the application level (multiple packets into a
complete e-mail message). Also, the hashing algorithm can be applied intelligently to
specific parts of the e-mail message (e.g., header fields, message body, and attachments).
Attachments that have been compressed for transport (e.g., ".zip" files) can be expanded for
inspection. Without such expansion, malicious code could hide inside a compressed attachment with
no repeatable hash signature visible at the packet transport level.

[0078] With the entire message available to be a part of the hashing process, packet
boundaries and packet fragmentation do not split sequences of bytes that might otherwise
provide useful hash signatures. A clever attacker might otherwise obscure a virus or worm attack
by causing the IP packets carrying the malicious code to be fragmented into pieces smaller than
that for which the hashing process is effective, or by forcing packet breaks in the middle of
otherwise-visible fixed sequences of code in the virus or worm. Also, the entire message is likely
to be longer than a single packet, thereby reducing the probability of false alarms (possibly due
to insufficient hash-block sample size and too few hash blocks per packet) and increasing the
probability of correct identification (more hash blocks will match per message than per packet,
since packets are only parts of the entire message).

[0079] Also, fewer hash-block alignment issues arise when the hash blocks can be
intelligently aligned with fields of the e-mail message, such as the start of the message body, or
the start of an attachment block. This results in faster detection of duplicate contents than if the
blocks are randomly aligned (as is the case when the method is applied to individual packets).
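
To illustrate field-aligned hashing, the sketch below uses Python's standard `email` package to locate the body and any attachment parts and then hashes fixed-size blocks starting at the beginning of each part. The function name, the 64-byte block size, and the choice of SHA-256 are assumptions made for this example and are not specified in the text.

```python
import hashlib
from email import message_from_string


def aligned_block_hashes(raw_message: str, block_size: int = 64) -> list:
    """Hash fixed-size blocks aligned to the start of each body or attachment part."""
    msg = message_from_string(raw_message)
    hashes = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no payload of their own
        payload = part.get_payload(decode=True) or b""
        # Blocks start at offset 0 of the part, so alignment does not depend on headers.
        for offset in range(0, len(payload), block_size):
            block = payload[offset:offset + block_size]
            hashes.append(hashlib.sha256(block).hexdigest())
    return hashes


# Two messages with different headers but the same body yield the same block hashes.
raw1 = "From: a@example.com\nSubject: one\n\nhello world body\n"
raw2 = "From: b@example.com\nSubject: two\n\nhello world body\n"
same = aligned_block_hashes(raw1) == aligned_block_hashes(raw2)
```

Because each block begins at a field boundary rather than at an arbitrary packet offset, replicated content produces identical hash values regardless of how the headers that precede it vary.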

[0080] E-mail-borne malicious code, such as viruses and worms, also usually includes a text
message designed to cause the user to read the message and/or perform some other action that
will activate the malicious code. It is harder for such text to be polymorphic, because automatic
scrambling of the user-visible text will either render it suspicious-looking, or will be very limited
in variability. This fact, combined with the ability to start a hash block at the start of the
message text by parsing the e-mail header, reduces the variability in hash signatures of the
message, making it easier to detect with fewer examples seen.

[0081] Finally, the ability to extract and hash specific headers from an e-mail message
separately may be used to help classify the type of replicated content the message body carries.
Since many legitimate cases of message replication exist (e.g., topical mailing lists, such as
Yahoo Groups), intelligent parsing and hashing of the message headers is very useful to reduce
the false alarm rate, and to increase the accuracy of detection of real viruses, worms, and spam.
[0082] This detection technique, compared to others which might extract and save fixed
strings, relies on a hash-based filter. With such a filter, it is possible to determine whether a
hash block has been seen before in another message, but it is virtually impossible to reconstruct
any piece of a message that has passed through the filter previously. Thus, this technique can
maintain the privacy of e-mail, without retaining any
information that can be attributed to a specific sender or receiver.
[0083] The foregoing description of preferred embodiments of the present invention provides
illustration and description, but is not intended to be exhaustive or to limit the invention to the 
precise form disclosed. Modifications and variations are possible in light of the above teachings 
or may be acquired from practice of the invention. 

For example, systems and methods have been described with regard to a mail server.
In other implementations, the systems and methods described herein may be used
within other devices, such as a mail client. In such a case, the mail client may periodically
obtain suspicion count values for its hash memory from one or more network devices, such as a
mail server.

While a series of acts has been described, the order of the acts may differ in other
implementations. In addition, non-dependent acts may be performed concurrently.

[0084] It may be possible for multiple mail servers to work together to detect and prevent
malicious or unwanted e-mail. For example, information might be distributed among the mail
servers to accelerate the detection process, especially for mail servers that experience
relatively low volume.

Further, certain portions of the invention have been described as "blocks" that perform one or
more functions. These blocks may include hardware, such as an ASIC or an FPGA, software, or a
combination of hardware and software.

No element, act, or instruction used in the description of the present application should be
construed as critical or essential to the invention unless explicitly described as such. Also, as
used herein, the article "a" is intended to include one or more items. Where only one item is
intended, the term "one" or similar language is used. The scope of the invention is defined by the
claims and their equivalents.